Vertical clouds are on the rise, as traditional clouds give way to specialization.
The infrastructure narrative of the last 15 years has been dominated by the cloud. Its growth has driven a surge of innovation in everything from hardware, to software, to operations – and resulted in the largest migration of workloads in the history of the industry. A handful of hyperscalers have driven this growth, and their impact is hard to overstate. They dominate the talent pool. They dominate market cap growth. And they dominate global IT spend, with the top few spending more than $5B in 2019 (and at least one over $10B). As a result, they also have outsized influence on the entire supply chain from chips to software, which they now use to drive agendas confined to their (relatively) narrow view of how infrastructure should look.
From this very well-earned position of dominance, they’ve convinced large swaths of the industry that they are the only companies who can build scalable and cost-effective cloud infrastructure upon which the rest of the industry should be content building applications. (Until, that is, they decide to launch competing services as a matter of course.)
However, like all architectural epochs, the large centralized clouds are just another waypoint on the long arc of IT infrastructure. And their extreme success is now resulting in yet another shift that is naturally pushing workloads back out to more specialized infrastructure. And that trend is the focus of this post.
This shift, by the way, is in no way any failing on the part of the traditional clouds. In fact, they’re still relatively early in their journey of consuming traditional IT, and will continue to do that for decades – growing significantly along the way. They are doing exactly what they should, building a very general offering to support the long tail of IT applications.
However, for the most part, to offer a higher-level resource, they end up just bundling the usual low-level resources in new ways, and they optimize them for the mean workload. As a result, the trad clouds are often too general, too inflexible, and their services too shallow. And so a new crop of infrastructure companies is being built to fill the growing need around rich, vertically integrated services. Or vertical clouds, as we like to call them.
Vertical clouds, which are entirely focused on a specific type of workload or cloud service, tend to be far more sophisticated, far more cost effective, and far more performant. And while they may be built on top of the trad, centralized clouds to start with, more and more we’re seeing them using special purpose physical infrastructure too.
These companies are now able to do this because (a) cloud applications are increasingly loosely coupled, allowing their developers to pick and choose what cloud infrastructure services they use, and perhaps most significantly, because (b) the size of the cloud has reached a point where the markets for these individual cloud services are large enough to support large, viable independent companies.
And so, in many ways, we’re entering a new and incredibly exciting era of infrastructure, in which any infrastructure service (and really any common sub-component of an application) is fair game to build a company around as a verticalized cloud. The better you are at building the infrastructure, the better the service will be. And because the market is large enough to sustain this, the large central clouds are at a structural disadvantage when they try to compete. The primary question for startups in infrastructure has ironically shifted from “what if AWS/GCP/Azure decides to compete with you?” to “why aren’t you competing more directly with AWS/GCP/Azure?”
To pull away from the abstract, let’s look at a few areas where this is concretely changing how infrastructure companies are being built and used. We’ve selected three, but there are many, many others (and given time we’d love to write those up too):
Interactive web apps clearly benefit from compute being closer to the user. And because the back end is now being abstracted as a set of services, the teams that implement them can push them closer to the user by taking advantage of the CDN tier. And that’s exactly what we’re seeing happen now. Many new services, from databases, to rendering engines, to collaboration tools, are being implemented as workloads and run on next-generation CDNs such as Fly.io and Cloudflare; this provides both power to the front end and performance to the users. Naturally, Fly and Cloudflare build their own infrastructure, because ultimately that’s the best way to provide low cost and good performance, if you have the ability to do so.
AI is driving new hardware build-outs: Different workloads benefit from different configurations of hardware and software. So much so that running a workload on a platform not tuned for it can make cost and performance worse by orders of magnitude. While there are many classes of workloads that both fall in this category and are popular across the cloud user base, perhaps the most notable of those are AI workloads. It’s well known that Facebook, Google, Microsoft, and many other companies have built bespoke clusters for AI training.
AI workloads differ dramatically from the traditional cloud applications built around web servers and databases. A specialized cloud not only has different silicon built for AI/ML computation, but also a different scheduler, network interconnect, modes for managing failure, and many other design aspects tuned for this purpose. Given all this, it’s little wonder that over the last few years we’ve seen AI-focused clouds such as MosaicML grow in popularity.
This isn’t to say that the trad clouds can’t provide AI-focused services. All of the popular ones do. But it’s yet another service, sitting alongside hundreds of others. And so for those users for whom AI workload cost and performance are paramount, it’s understandable that they’re increasingly exploring other options.
App platforms going full stack: The previous two examples are horizontal infrastructure components that sit well below the app layer. However, we’re also starting to see the trend move to higher-level app platforms. One of our favorite examples of this is the work being done at Mighty. Mighty runs the browser as a service in the cloud to make web apps faster (loading, interactivity, stacking, etc.). The team chose specific hardware and integrated it with its own Chromium browser to provide an experience that’s orders of magnitude better than traditional public cloud offerings for Remote Desktop software, while rivaling a native laptop experience.
—
Before we wrap up, it’s worth noting that a lot of both technical and business innovation over the last decade is helping enable this trend. Technically, it’s easier than ever to stand up a bare metal hardware offering. And the hardware vendors have adapted their business models to allow for full OpEx rather than CapEx.
We also want to make the point that the focus of this piece is on infrastructure companies that are built by infrastructure teams whose expertise is, well, infrastructure. There are many such teams that are fully capable of building scalable systems from hardware on up. In the past, the markets simply weren’t large enough to sustain this level of specialization. But they are now. And a new era of infrastructure is following. And we’ll all be better for it.