The Next Layer of the Data Stack: Coordinating Data State

The Entire Data Stack Is Becoming More Modular. Now Someone Has to Coordinate It.

Modern data infrastructure has largely solved execution. Storage is standardizing around open table formats, and query engines are increasingly modular and interchangeable. As a result, the hardest problems are shifting away from running workloads and toward coordinating evolving data state across systems, workflows, and consumers. 

The biggest change in how products and workflows have evolved is that data has moved from a downstream analytics artifact to a core part of production systems: it increasingly powers production software, AI systems, and user-facing features. This convergence is pushing data engineering and software engineering toward a shared runtime and ownership model. The question now is not how data workloads are executed, but who controls and manages the state that everything depends on.

In this piece, we explore how the data stack has evolved, the coordination challenges emerging from modular systems, and the areas of infrastructure we believe are most interesting going forward.

Thanks to all the operators and founders who shared feedback and perspectives along the way. We’d also love to hear your thoughts: if you’re building in this space or thinking about similar problems, please reach out to Taha Mubashir or Ayush Malhotra.
