A broad trend in technology that’s been discussed at length is the increase in data our systems have to handle, and an immediate corollary is a commensurate increase in the data and logs those systems produce. This is so often stated that it has become a cliché, but the fact remains that as our systems process more information, from more endpoints, and become disaggregated into more components (e.g., microservices), our methods for capturing and analyzing that data must evolve.
As a result, the software ecosystem has been moving quickly to keep pace, so much so that it can be a real challenge to keep track of all the progress. Yet every so often a project becomes so pervasive that it demands attention.
Druid is one such project. We first started hearing about it last year and were struck by the breadth and pace of its adoption. It was being used by our portfolio companies (e.g., Airbnb), by large public vendors (e.g., Cisco), by web giants (e.g., Alibaba, eBay, Netflix), and by media organizations (e.g., Condé Nast). As a high-performance online analytical processing (OLAP) database, it has been scaled to petabytes of data and trillions of events, ingesting millions of events every second.
While Druid is being used across a wide range of use cases, the most prevalent is (near) real-time analytics for operations and the business. Traditionally, the massive increase in data has outpaced our ability to process it, so in many companies it has been dumped into data lakes without the timely access needed in key moments, such as responding to a critical system event or a change in customer sentiment. Druid, on the other hand, is a streaming database that can ingest massive amounts of data and support sophisticated sub-second analytic queries such as grouping, filtering, and aggregation, which finally allows organizations to draw complex insights and respond to events as they happen.
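To make that concrete, here is a minimal sketch of the kind of grouped, filtered aggregation such a system supports, sent to Druid’s SQL endpoint on a broker. The datasource and column names (clickstream, country, latency_ms) are hypothetical and purely for illustration, and the broker address assumes a default local setup.

```python
import requests

# Hypothetical example: filter, group, and aggregate the last 15 minutes
# of events via Druid's SQL endpoint (POST /druid/v2/sql on a broker).
# "clickstream", "country", and "latency_ms" are made-up names.
DRUID_SQL_URL = "http://localhost:8082/druid/v2/sql"

query = """
SELECT
  country,
  COUNT(*)        AS event_count,
  AVG(latency_ms) AS avg_latency
FROM clickstream
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '15' MINUTE
GROUP BY country
ORDER BY event_count DESC
LIMIT 10
"""

response = requests.post(DRUID_SQL_URL, json={"query": query})
response.raise_for_status()

# By default, Druid returns results as a JSON array of row objects.
for row in response.json():
    print(row["country"], row["event_count"], row["avg_latency"])
```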
Druid was developed at Metamarkets by a team that included Fangjin (“FJ”) Yang, Vadim Ogievetsky, and Gian Merlino, who then spun out to create the company Imply. While Druid has clearly been a phenomenon in its own right, our expectations were exceeded again when we got to know the Imply team and the progress they’ve made. FJ led the push to open-source Druid and was the main force behind its initial traction in the community. Vadim co-authored d3.js while pursuing his master’s work at Stanford University, and Gian worked on Druid’s data ingestion pipeline, helping to improve its scalability to where it is today. At Imply, with very little funding and no salespeople, they have cultivated an exceptionally impressive customer list, with deep engagements and hugely positive feedback from their customers.
Today we’re excited to announce that we led the Series A investment in the company. Getting to know the team and the space has been a real eye-opener. Everyone talks about how businesses are consuming more and more data, and are under pressure to process it in real time. Operational analytics is a real movement and Druid is a key driver accelerating it. We look forward to working with Imply to offer the best enterprise solutions and services in the space.