Toward a Horizontal Robotics Platform

Oliver Hsu Posted August 26, 2024

AI is finally beginning to fulfill its massive transformative potential, as evidenced by the spate of new AI-enabled products across text, images, video, audio, and more. But as far as production-ready products go, one modality has thus far been notably absent from this ongoing Cambrian explosion of AI: physical actions.

The types of physical actions generally performed by robots have largely been trapped within the confines of Moravec’s paradox and have advanced nowhere near as quickly as other modalities. There are, of course, understandable reasons for this, ranging from the difficult unit economics of physical automation solutions to the challenges of delivering on correctness for the long tail of physical tasks.

Over the last two years, however, there has been a meaningful acceleration in the talent, capital, and research progress in the robotics domain. Overlapping bodies of research appear to be heading in the direction of more general robots — toward the promise of generalist embodied AI agents. This research progress includes, among other things, the pursuit of scaling-laws hypotheses for robotics and the emergence of vision-language models applied to robot actions; advances in methods around co-training and cross-embodiment that increase the leverage of robotics data; and progress toward bridging neural nets to low-level controls for truly end-to-end robot learning.

Moreover, leading researchers in the field are spinning up commercial research efforts and new companies, and billions of dollars in capital have been allocated towards robotics startups in 2024 alone. The confluence of talent, capital, and technology in the field suggests we are in the midst of a robotics and embodied AI upswing that could eventually enable the development of a horizontal robotics platform, thus giving more developers the opportunity to innovate in this field. 

A horizontal robotics platform

One way to think about robots is that they are computers that, to date, have lacked maturity compared to other computing platforms. They largely have yet to develop the depth in operating systems, developer platforms, and other components of the ecosystem that help accelerate developer activity.

Historically, robotics companies have been built in a somewhat siloed fashion. They typically have defined a constrained automation problem and built a tailored solution for that automation problem from the ground up. Companies often collected their own data, configured their own hardware, and built their own intelligence layer as well. While there exists middleware, like ROS (Robot Operating System), and a standard set of hardware OEMs that provide commercial, off-the-shelf hardware for robotics solution providers to build on, these products don’t really constitute a robust, horizontal platform. Instead, they function more as discrete developer tools and components of a robotics solution (for instance, despite its name, ROS is not actually an operating system under any strict definition).

The issue with this siloed approach to development is that, while tools and infrastructure for robotics developers exist, there has been less of an apps > infrastructure > apps > infrastructure cycle than in other categories of computing. These cycles accelerate the building of key components that can constitute a platform upon which developers can build a diverse set of applications. A move towards generality, where there is a horizontal base for robotic intelligence, can enable a common foundation to accelerate this developer ecosystem.

At the core of the current general robotics wave is the hypothesis that the scaling laws observed in other AI modalities will apply to robotic actions. The idea is that the bitter lesson extends to robotics: that advances in AI are a function of scaling data and compute. Rather than having specific models of robotic intelligence for specific use cases, we could have robotic foundation models that extend across environments and tasks. There is some debate as to whether scale can “solve” robotics (see here for a breakdown of arguments for and against), but the general direction of research, especially among newer commercial research teams, is towards scaling robot data as a means of building a large base model for robot actions (what is typically meant by “foundation model for robotics”).
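As a rough illustration of what a "scaling law" means in this context: in language modeling, held-out loss has been observed to fall approximately as a power law in dataset size, loss ≈ a · D^(-b). Whether robot-action data follows a similar curve is exactly the open hypothesis. The sketch below fits a and b by least squares in log-log space. The data points are synthetic, generated from a known curve purely to demonstrate the fit, and the function name `fit_power_law` and all constants are hypothetical.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-b) by ordinary least squares on log-log data."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -b on a log-log plot
    intercept = mean_y - slope * mean_x  # equals log(a)
    return math.exp(intercept), -slope

# Synthetic (not measured) data drawn from a known power law.
TRUE_A, TRUE_B = 5.0, 0.3
sizes = [1e3, 1e4, 1e5, 1e6]  # hypothetical dataset sizes
losses = [TRUE_A * s ** (-TRUE_B) for s in sizes]

a, b = fit_power_law(sizes, losses)
# Extrapolate one order of magnitude beyond the largest dataset.
predicted_loss = a * 1e7 ** (-b)
```

If real robot data produced points that fell on a straight line in log-log space, the fitted exponent would indicate how much additional data is needed for a given improvement; if the points bent away from the line, the power-law assumption would not hold.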

If data scaling laws do hold, this move toward generality may constitute the beginnings of a cohesive robotics platform. Widely available, and horizontal, robotic hardware platforms and intelligence layers can drive down the cost of developing robotics applications. We could finally have a unified platform for computers to have read/write access to the physical world, which is currently scattered across various sets of capabilities for vision, sensing, manipulation, and locomotion.

Naturally, progress in AI and robotics has raised the prospect of a “ChatGPT moment” for robotics: an inflection point when the technology experiences a mass-market product breakthrough. The physical world, however, is highly variable (much more so than virtual or human-created domains) and involves an enormous number of parameters, which may make a single product solution difficult. As such, it is possible that the breakthrough robotics moment will look less like a single consumer-grade product and more like a common operating system enabling an ecosystem of devices, developer tools, and applications: more like Android than like an iPhone.

We think a functional market structure for a new robotics platform will look something like this:

A brief history of robotics startups

However, any meaningful discussion of commercial robotics needs to address the elephant in the room: As a category, the commercial outcomes for robotics companies have not been good, and founders, technologists, and investors in the field have a fair amount of scar tissue.

Commercially successful outcomes such as Kiva Systems were few and far between over the last two decades, and even those were typically moderate when compared to successes in other contemporaneous markets. Failure was much more common. Some common challenges historically faced by robotics companies include:

  • Difficult economics, with R&D and deployment costs significantly greater than the prevailing wage for labor on common robotics tasks.
  • Margin compression due to crowded markets, and the commodification of robot solution providers for generic tasks like pick and place. 
  • Technical challenges of having robots perform robustly and reliably. In the physical world, you typically have to be 100% correct — 80% or a creative response is not enough. 
  • Failure to launch a product within a limited cash runway and a limited number of iteration cycles.
  • Difficulties in go-to-market and integration.
  • Tradeoffs between constrained solvable problem spaces and market size.

Ongoing progress

These challenges are surmountable, though, and — as noted above — there are reasons to be optimistic about current developments in the space.

One driving force for optimism is a new infusion of talent into the commercial robotics category. Over the last two years, leading researchers in the field have moved from academic and large commercial labs toward starting their own companies. The authors of many leading robotics papers are now in-house at places like Physical Intelligence, Skild, and a number of other newly formed companies aiming to build the robotic intelligence layer. Additionally, top AI researchers have moved into a number of companies building full-stack robotic applications, such as 1X, Figure, and more.

We are also seeing the emergence of more teams focused on robotics efforts inside other AI and developer-facing companies, such as Nvidia’s GR00T or Hugging Face’s robotics team.

Another reason for cautious optimism is a surge in capital being directed toward horizontal robotics efforts over the last year. While a capital surge alone certainly does not guarantee the success of this robotics wave, capital is a necessary component given the capital-intensive nature of the efforts that form the foundation of a horizontal robotics platform. Moreover, a long-term view could characterize such capital patterns as the early-to-mid stages of a Perezian financial cycle, with varying relationships between financial and production capital, that ultimately leads to a technology revolution in robotics: one in which the promise of general, intelligent robots is eventually realized.

As one might expect, however, research rules the day. A number of related areas of research have made significant progress over the last two years and suggest new open questions and avenues of research:

Applications of language models to robotic planning and reasoning

Large language models for robotic task planning, vision-language models for reasoning over visual inputs and perception, and related applications of language models to embodied intelligence offer an opportunity to greatly improve the reasoning capabilities of robots, particularly when multimodal models are applied to physical environments and task planning. Spatial and world models are another emerging category of large models that could greatly improve the physical reasoning abilities of robots.

Various methods of scaling robot data

Under the hypothesis around the applicability of scaling laws to robotics, this area of work addresses what is perhaps the key bottleneck in the field. In contrast with available data sources for text, video, and images, there is no such thing as internet-scale robotics data. However, because one of the core goals of this current robotics wave is to solve general robotics via scaling, unlocking different methods of scaling robot data is a key research area. These methods include improved teleoperation and human-behavior cloning, simulation-based methods, learning from video data (especially egocentric video), and hardware setups that enable low-cost robot data collection (sometimes without robots at all). In all likelihood, we will need data collected via multiple, if not all, of these methods.

End-to-end learning and working toward low-level controls

While language models can provide greater capabilities for high-level reasoning and planning, the promise of a robotic foundation model is that this intelligence will extend to the low-level robotic controls as well. Over the last decade or so, neural networks have taken on more and more of the robotics stack, from perception through task- and motion-planning. Bridging to the low level, such that robots are able to rely on neural networks end-to-end (i.e., from sensor input through to action output), would make robots significantly more intelligent and general, and it remains a key area of work in order to fully realize the potential of robotics foundation models.
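To make "end-to-end" concrete, here is a minimal sketch in which a single small neural network maps a raw sensor vector directly to a bounded action vector, with no separate hand-written perception, planning, or control stages in between. Everything here is an assumption for illustration: the names `make_mlp` and `policy`, the 16-value observation, the 7-value action, and the layer sizes are hypothetical, and the weights are random rather than trained, so the output is not a meaningful control signal.

```python
import math
import random

def make_mlp(layer_sizes, seed=0):
    """Build randomly initialized (untrained) dense layers for illustration."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def policy(layers, observation):
    """Map a sensor vector to an action vector in one forward pass.

    tanh is applied at every layer, so each action value lies in [-1, 1].
    """
    h = list(observation)
    for weights, biases in layers:
        h = [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h

OBS_DIM = 16     # hypothetical: flattened sensor readings
ACTION_DIM = 7   # hypothetical: e.g. joint commands for one arm

layers = make_mlp([OBS_DIM, 32, ACTION_DIM])
observation = [0.1 * i for i in range(OBS_DIM)]
action = policy(layers, observation)
```

In a real system the network would be far larger, the inputs would include images and proprioception, and the weights would be learned from robot data; the point of the sketch is only the shape of the interface, sensor input in and action output out.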

Cross-embodiment learning

Some research around cross-embodiment learning has shown indications of robot policies that transfer across different embodiments, or the use of data collected from one embodiment for robots of a different embodiment. This research plays an important role in the emerging robotics platform for two main reasons. First, it improves the efficiency of scaling robot data — if data collected on one embodiment can be applicable to robots of a different embodiment, it may make it more achievable to reach the required amounts of data to start seeing the results of scaling. Second, given the diversity of environments and tasks in the physical world, it is likely that there is not one optimal universal embodiment applicable across every environment-task combination. As such, robotic intelligence that is able to generalize across embodiments can greatly improve the horizontal nature and capabilities of the robotics platform, as opposed to requiring different models for different embodiments, or only being able to use one general embodiment as the universal hardware platform.
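One simple way to pool data from robots with different action dimensions is to pad every action to a shared maximum length and carry a mask marking which dimensions are real, so a single model can train on all of it. The sketch below illustrates that idea only; it is not the method of any specific paper, and the embodiment names, their dimensions, and the functions `to_shared_format` and `masked_squared_error` are all hypothetical.

```python
# Hypothetical embodiments and their action dimensions.
EMBODIMENT_ACTION_DIMS = {
    "mobile_base": 2,    # e.g. linear and angular velocity
    "single_arm": 7,
    "bimanual": 14,
}
MAX_ACTION_DIM = max(EMBODIMENT_ACTION_DIMS.values())

def to_shared_format(embodiment, action):
    """Pad an embodiment-specific action to the shared length, with a mask."""
    dim = EMBODIMENT_ACTION_DIMS[embodiment]
    if len(action) != dim:
        raise ValueError(f"{embodiment} expects {dim} values, got {len(action)}")
    pad = MAX_ACTION_DIM - dim
    return {
        "embodiment": embodiment,
        "action": list(action) + [0.0] * pad,
        "mask": [1] * dim + [0] * pad,
    }

def masked_squared_error(predicted, sample):
    """Mean squared error over real dimensions only; padding is ignored."""
    errors = [(p - t) ** 2
              for p, t, m in zip(predicted, sample["action"], sample["mask"])
              if m]
    return sum(errors) / len(errors)

sample = to_shared_format("mobile_base", [0.5, -0.1])
# A prediction that is right on the real dimensions and arbitrary elsewhere.
prediction = [0.5, -0.1] + [9.0] * (MAX_ACTION_DIM - 2)
loss = masked_squared_error(prediction, sample)
```

Because the loss skips padded dimensions, a mobile base sample and a bimanual sample can sit in the same training batch without the smaller robot being penalized for outputs it does not have.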

Novel and advanced hardware

Advances like reductions in sensor costs, improvements in dexterous robot hands, novel custom actuators, and new humanoid robot efforts all contribute to the improvement of hardware capabilities of the robotics platform. In general, the more dexterous and mobile a robot is, the less it’s limited by hardware constraints. However, hardware advances require not just an improvement in capabilities, but also an improvement in the economics and manufacturability of these systems.

Opportunities to build

Given the activity in this robotics cycle, we think there are a number of interesting opportunities in which entrepreneurs can build important pieces of the emerging robotics platform. Here’s a non-exhaustive list:

  • Methods of crowdsourcing robotics data across diverse environments, including via low-cost hardware, VR, games, marketplaces, or mechanism design.
  • Teleoperation businesses focused on robot data collection and annotation that find creative ways of achieving advantageous economics for data collection at scale. 
  • Robotics applications with highly specialized task/environment data that may be able to take advantage of future broad improvements in robotic intelligence by building custom solutions with their domain-specific data. 
  • Robotics developer tools and platforms for simulation, data tooling and observability, teleoperations, and related workstreams.
  • An American hardware OEM for mass-manufacturable advanced robotic hardware (such as quadrupeds, legged and wheeled bimanual robots, etc.).

Toward a future of intelligent machines

Robotics is certainly a difficult category, and many teams have been thwarted by the inherent difficulties of building for the physical world. However, the emerging robotics wave suggests a number of reasons for optimism. We are seeing multiple trends across talent, capital, and research come together to form the beginnings of an emerging robotics platform. 

As we embark on this new robotics cycle, it remains to be seen whether the development arc of the robotics and embodied AI ecosystem looks more like autonomous vehicles (highly siloed and centralized development over long time horizons) or language models (a variety of competing base models that support an ecosystem of decentralized developer activity). We believe the latter can lead to the emergence of a robust robotics platform that kickstarts a flywheel in the category.

If you’re building a piece of the emerging robotics platform, please reach out.

