by Pete Sonsini, Nov 19, 2020
One of the biggest shifts we’re seeing in the data ecosystem is the rise of the analytics engineer. These are data practitioners who are used to working with SQL and BI platforms. Increasingly, however, they own model creation as well as data extraction, transformation, and flow, and they are building systems around their practice to get good CI, version control, and documentation. The craft of analytics is progressing and becoming more like software development, and we’re excited to invest in great tools and platforms that aid this evolution.
The analytics space has produced some of the largest tech companies in the last decade. We try to get primary intel about gaps in the market and look for opportunities to invest in. We talk to lots of relevant folks and follow the pain.
Earlier this year, we spoke to 70 data practitioners about the biggest challenges they face in their jobs. Overwhelmingly, the most common response was about data quality. They didn’t all use the same vocabulary to describe this pain, but it was clear to us that there is a hole in the data analytics stack. Analysts want to do more, but feel limited by their tools. They are constantly plagued by challenges of data “integrity”, “cleanliness”, and “truth”.
We first connected with Gleb Mezhanskiy, Co-founder and CEO of Datafold, in May. We saw how he was advancing the discussion of what data quality actually means and that he was creating a community where analysts could describe what they had tried, where they had succeeded, and where they had fallen down. Gleb has also published a lot of his thoughts on the modern data stack, and we are continually impressed with his leadership.
Datafold is starting with a simple product: Data Diff. It does one thing really well: it allows analysts to see clearly how their changes to code impact data downstream, providing some visibility and assurance that things are working as expected, or not. This reminded us of what Sentry, another portfolio company, did for software developers. Sentry provides the best crash analytics and error reporting on the market. They’re tightly integrated into the developer workflow and enable developers to ship higher quality code. Observability and monitoring platforms are a must-have for software apps and infrastructure, but the analogous tools are missing for data.
No longer. Analytics engineers are the heroes of their data teams and organizations, bringing engineering-inspired best practices to their jobs. We heard loud and clear that data quality is important, and Datafold is equipping analytics engineers with the tools to address the observability piece. They’re just at the beginning, but building quickly with the community in mind. We’re thrilled to lead Datafold’s seed financing, and look forward to partnering with Gleb and the team on the next chapter of data analytics tools.
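For readers who want a concrete picture of what a data diff looks like, here is a minimal sketch in Python. It is not Datafold's implementation, which this post does not describe. The function name `diff_tables`, the `order_id` key, and the sample rows are all hypothetical, and the sketch assumes both versions of a table fit in memory and share a primary-key column.

```python
from typing import Any

Row = dict[str, Any]


def diff_tables(before: list[Row], after: list[Row], key: str) -> dict[str, list]:
    """Compare two versions of a table, matching rows on a primary-key column.

    Hypothetical helper for illustration only.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Keys present in only one version are rows that appeared or disappeared.
    added = sorted(after_by_key.keys() - before_by_key.keys())
    removed = sorted(before_by_key.keys() - after_by_key.keys())

    # For keys present in both versions, list the columns whose values differ.
    changed = []
    for k in sorted(before_by_key.keys() & after_by_key.keys()):
        old, new = before_by_key[k], after_by_key[k]
        cols = sorted(c for c in old.keys() | new.keys() if old.get(c) != new.get(c))
        if cols:
            changed.append((k, cols))

    return {"added": added, "removed": removed, "changed": changed}


# Made-up output of the same model before and after a code change.
before = [
    {"order_id": 1, "revenue": 100, "status": "paid"},
    {"order_id": 2, "revenue": 250, "status": "paid"},
    {"order_id": 3, "revenue": 75, "status": "refunded"},
]
after = [
    {"order_id": 1, "revenue": 100, "status": "paid"},
    {"order_id": 2, "revenue": 225, "status": "paid"},
    {"order_id": 4, "revenue": 40, "status": "paid"},
]

report = diff_tables(before, after, key="order_id")
print(report)
# {'added': [4], 'removed': [3], 'changed': [(2, ['revenue'])]}
```

Even in this toy version, the report surfaces the kind of thing an analyst would want to know before merging a change: one order vanished, one appeared, and one has a different revenue figure.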
Learn more about the team and vision here.