Posted February 21, 2023

It’s very rare to see a new building block emerge in computing. If aliens landed on earth and decompiled our software, every app would look roughly the same: some combination of networking, storage, and compute. The way developers consume these resources, and where they are housed, has of course changed dramatically over time. But the core concepts are as old as computing itself, dating back as far as the abacus (~2700 BCE), the Analytical Engine (1837), and the SAGE radar network (1950s).

Large AI models like Stable Diffusion and ChatGPT represent a fundamentally new building block. By integrating large models (LMs) into software, developers can expose functionality that wouldn’t be possible otherwise, including generating visual or text content, classifying existing content, and drawing semantic (rather than formal) connections in data. This may be one of the biggest changes in software we’ve ever seen — it’s not just running software on a new platform (e.g. a mobile device), but is a net new type of software.

The only problem is that LMs are still hard to use. Most developers are not machine learning engineers — globally, software engineers outnumber machine learning engineers by roughly 60 to 1 (~30 million versus ~500,000). Large-scale pre-training has made AI dramatically more accessible, but software developers still face a series of hurdles (e.g. where to host models, what to do when they break, and how to build model differentiation over time) to get AI apps running in production, especially at scale. Clean abstractions and simple tools for LMs don't exist.

This is the problem Replicate aims to solve, by being something like the Vercel of machine learning. We’re excited to announce today that we are leading Replicate’s Series A round to help the company grow and achieve their vision to make AI usable at scale.

The core tenet of Replicate's product is that all open source AI models should be available, and easy to use, in one place. Developers should be able to get up and running on LMs with zero machine learning work, hosting setup, or inscrutable Python/CUDA errors. It should be easy to compose several models into a pipeline. And, as apps scale up, developers should have access to simple tools for fine-tuning and hosting their own models.
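To make the "pipeline" idea concrete, here is a minimal sketch of chaining two hosted models, where the output of an image-captioning model feeds a text-generation model. In Replicate's real Python client this is done with calls along the lines of `replicate.run("owner/model:version", input={...})`; the stub functions below stand in for those network calls so the wiring itself is visible, and the model roles are illustrative assumptions rather than specific models from this post.

```python
# Sketch of composing two hosted models into a pipeline.
# The two functions below are local stand-ins for remote model calls
# (e.g. replicate.run(...) in Replicate's Python client).

def caption_model(image_url: str) -> str:
    # Stand-in for an image-captioning model call.
    return f"a caption for {image_url}"

def text_model(prompt: str) -> str:
    # Stand-in for a text-generation model call.
    return f"generated text from: {prompt}"

def run_pipeline(image_url: str) -> str:
    # Pipe one model's output into the next model's input.
    caption = caption_model(image_url)
    return text_model(caption)

print(run_pipeline("https://example.com/photo.png"))
```

The point of the abstraction is that each stage is just a function from inputs to outputs, so swapping in a different hosted model changes one call, not the pipeline's shape.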

This is all possible because Replicate focuses only on developer experience and general abstractions — in contrast to model providers that are tied to a single model architecture and spend most of their resources developing better models.

So far, Replicate has attracted thousands of active developers to the platform, many of them building visual generative AI apps. Some of the most sophisticated and well-known AI companies are using Replicate. We view this as early validation that even highly capable developers don’t want to reinvent the wheel, and that Replicate is building the right product for this audience.

The Replicate team is uniquely equipped to tackle this problem. Ben Firshman designed the first version of Docker Compose, a tool now used by millions of developers, and has a superpower for understanding developer experience. Andreas Jansson was a senior machine learning engineer at Spotify, where he deployed large-scale production AI models and designed new AI tooling from scratch. Together, they released Cog, a simple, container-based model packaging system that now powers Replicate.
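For a sense of what Cog's packaging looks like, here is a minimal, illustrative fragment based on Cog's public conventions: a `cog.yaml` declaring the environment, pointing at a `Predictor` class that defines the model's interface. The package names, versions, and input are placeholder assumptions, not taken from this post.

```yaml
# cog.yaml — declares the container environment for the model (illustrative)
build:
  python_version: "3.10"
  python_packages:
    - "torch==2.0.1"   # placeholder dependency
predict: "predict.py:Predictor"
```

```python
# predict.py — the model's interface, using Cog's BasePredictor (illustrative)
from cog import BasePredictor, Input

class Predictor(BasePredictor):
    def setup(self):
        """Load model weights into memory once, at container start."""
        ...

    def predict(self, prompt: str = Input(description="Text prompt")) -> str:
        """Run a single prediction for the given input."""
        ...
```

With those two files, `cog build` produces a standard container image, which is what lets Replicate host any packaged model behind a uniform API.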

We’re only beginning to see the power of large models as a new building block in software. We think Replicate has an important role to play in getting these models into the hands of the next million developers, and we’re thrilled to support them in this mission.