American Dynamism

Frontier Systems for the Physical World

Oliver Hsu Posted April 15, 2026


The dominant paradigm in AI today, insofar as it is used in production-ready settings, is organized around language and code. The scaling laws governing large language models are well-characterized, the commercial flywheel of data, compute, and algorithmic improvement is spinning, and the returns to incremental capability gains remain large and mostly legible. This paradigm has earned the capital and attention it commands.

But a set of adjacent and related fields has been making meaningful strides through its gestation phase. These areas of activity include VLAs, WAMs, and other approaches to generalist robotics models; physical and scientific reasoning in the pursuit of AI scientists; and novel interfaces for human-computer interaction (including BCIs and neurotech) that take advantage of advances in AI to rethink how we interact with machines. Beyond technical progress, each of these areas has seen the beginnings of an influx of talent, capital, and founder activity. The technical primitives for extending frontier AI into the physical world are maturing concurrently, and the pace of progress over the past eighteen months suggests that these fields could soon enter a scaling regime of their own.

In a given technology paradigm, the areas with the greatest delta between their current perceived capabilities and medium-term upside potential tend to be those that benefit from the same scaling dynamics driving the current frontier, but sit one step removed from the incumbent paradigm — close enough to inherit its infrastructure and research momentum, but distant enough to require non-trivial additional work. This distance serves a dual function: it creates a natural moat against fast-following, and it defines a problem space that is richer, less explored, and more likely to produce emergent capabilities precisely because the easy paths have not already been taken.

Three domains fit this description today: robot learning, autonomous science (particularly in materials and life sciences), and new human-machine interfaces (including brain-computer interfaces, silent speech, neural wearables, and novel sensory modalities like digitized olfaction). These are not entirely separate efforts, and thematically are part of a group of emerging frontier systems for the physical world. They share a common substrate of technical primitives, like learned representations of physical dynamics, architectures for embodied action, simulation and synthetic data infrastructure, an expanding sensory manifold, and closed-loop agentic orchestration. They are mutually reinforcing in ways that create compounding dynamics across domains. And they are the areas where qualitatively new AI capabilities are most likely to emerge from the interaction of model scale, physical grounding, and novel data modalities.

This essay surveys the technical primitives underlying these systems, examines why these three domains specifically represent frontier opportunities, and proposes that their mutual reinforcement constitutes a structural flywheel for extending AI into the physical world.

Primitives

Before examining specific application domains, it’s worth understanding the shared technical foundations that make these frontier systems possible. Five main primitives underpin the advance of frontier AI into the physical world. These technologies aren’t necessarily specific to any particular application area; rather, they are the building blocks that enable the creation of systems that extend AI to the physical world. Their concurrent maturation is what makes the emerging moment distinctive.

Learned Representations of Physical Dynamics

The most fundamental primitive is the ability to learn compressed, general-purpose representations of how the physical world behaves — how objects move, deform, collide, and respond to force. Without this, every physical world AI system must learn the physics of its domain from scratch, a prohibitively expensive proposition.

Multiple architectural families are converging on this capability from different directions. Vision-Language-Action models (VLAs) approach it from above: they take pretrained vision-language models—already rich with semantic understanding of objects, spatial relations, and language—and extend them with action decoders that output motor commands. The key insight is that the enormous cost of learning to see and understand the world can be amortized across internet-scale image-text pretraining. Models like π₀ from Physical Intelligence, Google DeepMind’s Gemini Robotics, or NVIDIA’s GR00T N1 have demonstrated this architecture at increasing scale.

World Action Models (WAMs) approach the same capability from below: they build on video diffusion transformers pretrained on internet-scale video, inheriting rich priors about physical dynamics—how objects fall, occlude, and interact under force—and coupling these priors with action generation. NVIDIA’s DreamZero demonstrates zero-shot generalization to entirely new tasks and environments, achieving a meaningful improvement in real-world generalization while enabling cross-embodiment transfer from human video demonstrations with only small amounts of adaptation data.

A third path, and one that may be the most instructive for understanding where this field is heading, dispenses with both pretrained VLMs and video diffusion backbones entirely. Generalist’s GEN-1 is a native embodied foundation model trained from scratch on over half a million hours of real-world physical interaction data, collected primarily through low-cost wearable devices on humans performing everyday manipulation tasks. It is not a VLA in the standard sense (as there is no vision-language backbone being fine-tuned), nor is it a WAM. It is instead a first-class foundation model for physical interaction, designed from the ground up to learn representations of dynamics from the statistics of human-object contact rather than from internet images, text, or video.

Spatial intelligence, like that being built by companies like World Labs, is valuable for this primitive because it addresses a representation gap that VLAs, WAMs, and native embodied models all share: none of them explicitly model the three-dimensional structure of the scenes they operate in. VLAs inherit 2D visual features from image-text pretraining. WAMs learn dynamics from video, which is a 2D projection of 3D reality. Models that learn from wearable sensor data capture forces and kinematics, but not scene geometry. Spatial intelligence models can help fill this gap by learning to reconstruct, generate, and reason about the full 3D structure of physical environments — geometry, lighting, occlusion, object relationships, and spatial layout.

Convergence between approaches here is the point. Whether the representations are inherited from VLMs, learned through video co-training, or built natively from physical interaction data, the underlying primitive is the same: compressed, transferable models of how the physical world behaves. The data flywheel for these representations is enormous and largely untapped — encompassing not just internet video and robot trajectories, but the vast corpus of human physical experience that wearable devices are now beginning to capture at scale. The same representations serve a robot learning to fold towels, a self-driving laboratory predicting reaction outcomes, and a neural decoder interpreting the motor cortex’s plan for grasping.

Architectures for Embodied Action

Representations of physics are necessary but insufficient. Translating understanding into reliable physical action requires architectures that solve several interrelated problems: mapping high-level intent to continuous motor commands, maintaining coherence over long action horizons, operating within real-time latency constraints, and improving with experience.

The dual-system hierarchical architecture, which separates a slow, powerful vision-language model for scene understanding and task reasoning (System 2) from a fast, lightweight visuomotor policy for real-time control (System 1), has become the standard design pattern for complex embodiments. GR00T N1, Gemini Robotics, and Figure's Helix all adopt variants of this approach, resolving the fundamental tension between the rich reasoning that large models provide and the millisecond-scale control frequencies that physical tasks demand. Generalist, by contrast, takes a harmonic reasoning approach that allows for simultaneous thinking and action.
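
As a sketch of the dual-system pattern (all class and method names here are hypothetical stand-ins, not any production system's API), the two systems are essentially two loops running at very different rates:

```python
class System2Planner:
    """Slow loop: stands in for a large vision-language model that produces
    a latent task plan from an observation and an instruction."""
    def plan(self, observation, instruction):
        # In a real system this is an expensive model call, run at ~1 Hz.
        return {"instruction": instruction, "subgoal": "reach_towel"}

class System1Policy:
    """Fast loop: stands in for a small visuomotor policy that maps
    (observation, latent plan) -> continuous motor command."""
    def act(self, observation, latent_plan):
        # In a real system this runs at tens of Hz, often on-device.
        return [0.0] * 7  # e.g. a 7-DoF joint velocity command

def control_episode(planner, policy, instruction, steps=100, replan_every=50):
    """Interleave slow replanning with fast control: the planner fires
    rarely, while the policy emits a motor command on every tick."""
    plan, actions = None, []
    for t in range(steps):
        observation = {"t": t}          # stub for camera/proprioception readings
        if t % replan_every == 0:       # slow loop: update the latent plan
            plan = planner.plan(observation, instruction)
        actions.append(policy.act(observation, plan))  # fast loop: act every tick
    return actions

actions = control_episode(System2Planner(), System1Policy(), "fold the towel")
```

The design choice the sketch illustrates is that the expensive reasoning call is amortized over many control ticks, so control frequency is set by the small policy rather than the large model.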

The action generation mechanism itself is evolving rapidly. Flow matching and diffusion-based action heads, pioneered by π₀, have emerged as the dominant approach for producing smooth, high-frequency continuous actions, displacing the discrete tokenization methods borrowed from language modeling. These methods treat action generation as a denoising process analogous to image synthesis, yielding trajectories that are physically smoother and more robust to compounding errors than autoregressive token prediction.
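
A toy sketch of the inference-time idea behind flow-matching action heads, assuming linear (rectified-flow-style) probability paths; the velocity "network" below is a closed-form stand-in for a trained model, not any lab's actual implementation:

```python
import numpy as np

def velocity_field(a_t, t, target):
    """Stand-in for a trained network v_theta(a_t, t | observation).
    For linear paths toward a known target, the ideal velocity is simply
    (target - a_t) / (1 - t); a real model learns this from demonstrations."""
    return (target - a_t) / (1.0 - t)

def sample_action_chunk(target, horizon=16, dof=7, steps=10, seed=0):
    """Generate a chunk of continuous actions by integrating the velocity
    field from Gaussian noise (t = 0) toward the data distribution (t = 1),
    the denoising process analogous to image synthesis."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((horizon, dof))   # start from pure noise
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        a = a + dt * velocity_field(a, t, target)  # Euler integration step
    return a

target = np.zeros((16, 7))        # pretend "demonstrated" action chunk
chunk = sample_action_chunk(target)
```

Because the whole chunk is generated jointly rather than one token at a time, trajectories come out smooth across the horizon, which is the property the prose above attributes to these methods.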

But the most consequential architectural development may be the extension of reinforcement learning to pretrained VLAs — the idea that a foundation model trained on demonstrations can then improve through its own autonomous practice, much as a person refines a skill through repetition and self-correction. Physical Intelligence’s work on π*₀.₆ represents the clearest demonstration of this principle at scale. Their method, RECAP (RL with Experience and Corrections via Advantage-conditioned Policies), addresses a problem that pure imitation learning cannot solve: credit assignment over long task horizons. If a robot grasps an espresso machine’s portafilter at a slightly wrong angle, the failure may not manifest until several steps later when insertion fails. Imitation learning has no mechanism to attribute the failure to the earlier grasp; RL does. RECAP trains a value function that estimates the probability of success from any intermediate state, then conditions the VLA to select high-advantage actions. Critically, it incorporates heterogeneous data (demonstrations, on-policy autonomous experience, expert teleoperated corrections provided during execution, etc.) into a unified training pipeline.
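
RECAP's full pipeline is more involved, but the credit-assignment idea can be illustrated with a toy one-step advantage computation (an illustration of the principle, not Physical Intelligence's implementation):

```python
def td_advantages(values, rewards):
    """One-step advantages A_t = r_t + V(s_{t+1}) - V(s_t), with V(s_t)
    read as an estimated probability of eventual success. A sharp value
    drop pins blame on the action taken at that step, even when the
    resulting failure only manifests much later."""
    adv = []
    for t in range(len(values)):
        v_next = values[t + 1] if t + 1 < len(values) else 0.0
        adv.append(rewards[t] + v_next - values[t])
    return adv

# Toy episode: a bad grasp at step 0 (value falls 0.9 -> 0.5), with the
# failure only surfacing at the end of the episode.
values = [0.9, 0.5, 0.4, 0.1]    # V(s_t): estimated success probability per step
rewards = [0.0, 0.0, 0.0, 0.0]   # sparse reward: the episode ended in failure
adv = td_advantages(values, rewards)
blamed_step = min(range(len(adv)), key=lambda t: adv[t])
```

Imitation learning sees only the demonstration; the value function makes the early grasp the most negative-advantage step, which is exactly the signal an advantage-conditioned policy can learn to avoid.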

The results of this approach are encouraging for the future of RL for actions. π*₀.₆ folds laundry across 50 novel garment types in real homes, reliably assembles boxes, and prepares espresso drinks on a professional machine, running continuously for hours without human intervention. On the most difficult tasks, RECAP more than doubles throughput and cuts failure rates by half or more compared to the imitation-only baseline. The system also demonstrates that RL post-training yields qualitatively different behaviors from imitation, like smoother recoveries, more efficient grasp strategies, and adaptive error correction that were not present in the demonstration data.

These gains suggest that the same compute-scaling dynamics that drove LLMs from GPT-2 to GPT-4 are beginning to operate in the embodied domain — just earlier on the curve, and with an action space that is continuous, high-dimensional, and subject to the unforgiving constraints of real-world physics.

Simulation and Synthetic Data as Scaling Infrastructure

In language, the data problem was solved by the internet: trillions of tokens of naturally occurring text, freely available. In the physical world, the data problem is orders of magnitude harder, as the rapid rise of startups aiming to become data vendors for the physical world now makes clear. Real-world robot trajectories are expensive to collect, dangerous to scale, and limited in diversity. A language model can learn from a billion conversations; a robot cannot have a billion physical interactions (yet).

Simulation and synthetic data generation are the infrastructure layer that resolves this constraint, and their maturation is one of the key reasons physical world AI is accelerating now rather than five years ago.

The modern simulation stack combines physics-based simulation engines, photorealistic rendering via ray tracing, procedural environment generation, and world foundation models that bridge the sim-to-real gap by generating photorealistic video from simulation inputs. The pipeline runs from neural reconstruction of real environments (using only a smartphone), through population with physically accurate 3D assets, to large-scale synthetic data generation with automatic annotation.

The significance of improvements in the simulation stack is that they change the economic assumptions that underpin physical world AI. If the bottleneck in physical AI shifts from collecting real data to designing diverse virtual environments, the cost curve collapses. Simulation scales with compute, not with human labor or physical hardware. This transforms the economics of training physical world AI systems in the same way that internet-scale text data transformed the economics of training language models, and it means that investment in simulation infrastructure has outsized leverage on the entire ecosystem.

Simulation, however, is not only a robotics primitive. The same infrastructure serves autonomous science (digital twins of laboratory equipment, simulated reaction environments for hypothesis pre-screening), new interfaces (simulated neural environments for training BCI decoders, synthetic sensory data for calibrating novel sensors), and other domains where AI interacts with the physical world. Simulation is the universal data engine for physical world AI.

Expanding the Sensory Manifold

The physical world communicates through a far richer set of signals than vision and language. Touch conveys information about material properties, grip stability, and contact geometry that is invisible to cameras. Neural signals encode motor intent, cognitive state, and perceptual experience at bandwidths that dwarf any current human-computer interface. Subvocal muscle activity encodes speech intention before any sound is produced. The fourth primitive is the rapid expansion of AI’s sensory access to these previously inaccessible modalities, driven not only by research, but by an ecosystem building the devices, software, and infrastructure to capture and process these signals at consumer scale.

The most visible indicator of this expansion is the emergence of new device categories. These include AR devices, which have massively improved in user experience and form factor in recent years (with companies building applications on this platform for both consumer and industrial use cases), and voice-first AI wearables, which provide more comprehensive context for language-based AI by accompanying users into the physical world. Longer term, neural interfaces may open even more comprehensive modalities of interaction. AI represents a shift in computing that creates an opportunity to dramatically advance the way humans interact with computers, and companies like Sesame are building new modalities and devices to do that.

More dominant modalities like voice create tailwinds for emerging means of interacting with computers. As products like Wispr Flow push voice into more of a primary input modality (an advantage given its high information density), the market dynamics around silent speech interfaces also become more favorable. Silent speech devices, which use various sensors to detect tongue and vocal cord movements to decipher speech without sound, represent an even higher information density modality for interacting with computers and AI.

Brain-computer interfaces, both invasive and non-invasive, represent the deeper frontier, and the commercial ecosystem around them continues to progress. The signal to watch is the convergence of clinical validation, regulatory clearance, platform integration, and institutional capital around a technology category that was purely academic a few years ago.

Tactile sensing is entering embodied AI architectures, as some models in robot learning begin to explicitly include touch as a first-class part of their approach. Olfactory interfaces are becoming real engineering artifacts: wearable displays using miniaturized odor generators with millisecond response times have been demonstrated for mixed-reality applications, while smell models are being built to pair with visual AI systems for chemical process monitoring.

The pattern across all of these developments is that they converge on each other in the limit. AR glasses generate continuous visual and spatial data about how users interact with physical environments. EMG wristbands capture the statistics of human motor intent. Silent speech interfaces capture the mapping between subvocal articulation and linguistic output. BCIs capture neural activity at the highest resolution available. Tactile sensors capture the contact dynamics of physical manipulation. Each new device category is also a data-generation platform that feeds the models underlying multiple application domains. A robot trained on EMG-derived motor intent data learns different grasping strategies than one trained only on teleoperation. A laboratory interface that responds to subvocal commands enables a different kind of scientist-machine interaction than a keyboard. A neural decoder trained on high-density BCI data produces representations of motor planning that are inaccessible through any other channel.

The proliferation of these devices is expanding the effective dimensionality of the data manifold available for training frontier physical world AI systems — and the fact that much of this expansion is being driven by well-capitalized consumer product companies, not just academic labs, means that the data flywheel can scale with market adoption.

Closed-Loop Agentic Systems

The final primitive is more architectural. It is the ability to orchestrate perception, reasoning, and action into sustained, autonomous, closed-loop systems that operate over long time horizons without human intervention.

In language models, the analogous development was the emergence of agentic systems — multi-step reasoning chains, tool use, and self-correcting workflows that advanced models from single-turn question-answerers into autonomous problem-solvers. In the physical world, the same transition is underway, but the requirements are far more demanding. A language agent that makes an error can backtrack costlessly, whereas a physical agent that drops a beaker of reagent cannot.

Three properties distinguish physical world agentic systems from their digital counterparts. First, they require embodiment in the experimental or operational loop: direct interfaces to raw instrument streams, physical state sensors, and actuation primitives that ground reasoning in physical reality rather than text descriptions of it. Second, they require long-horizon persistence: memory, provenance tracking, safety monitoring, and recovery behaviors that maintain continuity across operational cycles rather than treating each task as a standalone episode. Third, they require closed-loop adaptation: the ability to revise strategies based on physical outcomes, not just textual feedback.
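
A minimal sketch of these three properties as a single control loop, with every function a hypothetical stand-in for real planners, instruments, and safety monitors:

```python
def run_closed_loop(propose, execute, analyze, safe, max_cycles=5):
    """Sustain propose -> execute -> analyze -> revise cycles. Provenance
    persists across cycles (long-horizon persistence); plans are screened
    before actuation (safety monitoring); beliefs are revised from physical
    outcomes rather than text (closed-loop adaptation)."""
    provenance = []
    belief = {"hypothesis": 0}
    for cycle in range(max_cycles):
        plan = propose(belief)
        if not safe(plan):                 # safety check before any actuation
            provenance.append({"cycle": cycle, "plan": plan, "status": "rejected"})
            continue
        outcome = execute(plan)            # embodiment: act on instruments, not text
        belief = analyze(belief, outcome)  # strategy revised from physical results
        provenance.append({"cycle": cycle, "plan": plan, "outcome": outcome,
                           "status": "completed"})
    return belief, provenance

# Stub implementations standing in for a real planner, robot, and analyzer:
belief, log = run_closed_loop(
    propose=lambda b: {"temp_c": 20 + 5 * b["hypothesis"]},
    execute=lambda plan: {"yield": plan["temp_c"] / 100},
    analyze=lambda b, o: {"hypothesis": b["hypothesis"] + 1},
    safe=lambda plan: plan["temp_c"] < 100,
)
```

The point of the structure is that state (the belief and the provenance log) outlives any single episode, which is what distinguishes a sustained autonomous system from a sequence of standalone tasks.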

This primitive is what transforms individual capabilities (a good world model, a reliable action architecture, a rich sensor suite) into functioning systems that can operate autonomously in the physical world. It is the integration layer, and its maturation is what makes the three application domains described below possible as real-world deployments rather than isolated research demonstrations.

Three Domains

The primitives described above are general-purpose enabling layers. They do not, by themselves, specify where the most important applications will emerge. Many domains involve physical action, physical measurement, or physical sensing. What distinguishes a frontier system from a merely improved existing system is the degree to which increasing model capabilities and scaling infrastructure compound within the domain — creating not just better performance, but qualitatively new capabilities that were previously impossible.

Robotics, AI-driven science, and new human-machine interfaces are the three domains where this compounding is strongest. Each one assembles the primitives in a distinct configuration. Each one is bottlenecked by limitations that the primitives discussed are lifting. And each one generates, as a byproduct of its operation, exactly the kind of structured physical data that makes the primitives themselves better, closing a feedback loop that accelerates the entire system. They are not the only physical AI domains worth watching, but they are the ones where the interaction between frontier AI capabilities and physical reality is densest, and where the distance from the current language/code paradigm creates the most space for emergence while remaining highly complementary and benefitting from these capabilities.

Robotics

Robotics is the most literal embodiment of the thesis: a domain that requires an AI system to perceive, reason about, and physically act upon the material world in real time. It is also the domain that most directly stress-tests every primitive simultaneously.

Consider what a general-purpose robot must do to fold a towel. It needs a learned representation of how deformable materials behave under force—a physics prior that no amount of language pretraining provides. It needs an action architecture that can translate a high-level instruction into a sequence of continuous motor commands at control frequencies of 20Hz or more. It needs simulation-generated training data, because no one has collected millions of real-world towel-folding demonstrations. It needs tactile feedback to detect slip and adjust grip force, because vision alone cannot distinguish a firm grasp from one about to fail. And it needs a closed-loop controller that can detect when the fold has gone wrong and recover, rather than blindly executing a memorized trajectory.

This is why robotics is a frontier system rather than a mature engineering discipline with better tools. The primitives do not merely improve existing robotic capabilities; they unlock categories of manipulation, locomotion, and interaction that were previously impossible outside of narrowly controlled industrial settings.

The frontier has advanced meaningfully in recent years, as we've previously written. The first generation of VLAs demonstrated that foundation models can control robots across diverse tasks. Architectural advances have made progress on bridging high-level reasoning and low-level control in robotic systems. On-device inference is becoming feasible, and cross-embodiment transfer means a model can adapt to an entirely new robot platform with limited amounts of data. The central remaining challenge is reliability at scale, which is the bottleneck to deployment. Even 95% per-step success yields only 60% on a 10-step task chain, and production environments demand far better. This is where RL post-training holds high potential, helping move us toward the capabilities and robustness that would indicate a domain entering its scaling regime.
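
The reliability arithmetic is easy to check directly, assuming independent per-step failures:

```python
def chain_success(per_step, steps):
    """Probability that a task chain succeeds end-to-end if each step
    succeeds independently: success compounds multiplicatively."""
    return per_step ** steps

# 95% per-step success collapses to roughly 60% over a 10-step chain ...
p10 = chain_success(0.95, 10)   # ~0.599

# ... and hitting 95% end-to-end on 10 steps requires ~99.5% per step.
required = 0.95 ** (1 / 10)     # ~0.9949
```

This is why small gains in per-step reliability, such as those RL post-training delivers, translate into outsized gains in long-horizon task completion.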

These advances have implications for market structure. For decades, value in robotics accrued to the mechanical system itself, and while that remains a key part of the stack, as learned policies become more standard, value migrates to models, training infrastructure, and data flywheels. But robotics also feeds back into the primitives previously discussed: every real-world trajectory is training data for better world models, every deployment failure reveals gaps in simulation coverage, and every new embodiment tested expands the diversity of physical experience available for pretraining. Robotics is both the most demanding consumer of the primitives and one of their most important sources of improvement signal.

Autonomous Science

If robotics tests the primitives against the demands of real-time physical action, autonomous science tests them against something slightly different – sustained, multi-step reasoning about causally complex physical systems, over time horizons measured in hours or days, with experimental outcomes that must be interpreted, contextualized, and used to revise strategy.

AI-driven science is the domain where the primitives combine most completely. A self-driving laboratory requires learned representations of physical and chemical dynamics to predict what an experiment will produce. It requires embodied action to pipette reagents, position samples, and operate analytical instruments. It requires simulation to pre-screen candidate experiments and allocate scarce instrument time. It requires expanded sensing, such as spectroscopy, chromatography, mass spectrometry, and increasingly novel chemical and biological sensors, to characterize outcomes. And it requires the closed-loop agentic orchestration primitive more than any other domain – the ability to sustain multi-cycle hypothesis-experiment-analysis-revision workflows without human intervention, maintaining provenance, monitoring safety, and adapting strategy based on what each cycle reveals.

No other domain draws on these primitives this deeply. This is what makes autonomous science a frontier system rather than simply laboratory automation with better software. Companies like Periodic Labs and Medra unify scientific reasoning capabilities with the physical capabilities of testing that reasoning in materials science and life sciences respectively, enabling scientific iteration and generating experimental training data along the way.

The value of such systems is fairly intuitive. Traditional materials discovery takes several years from concept to commercialization, and AI-accelerated workflows can potentially compress this process substantially. The binding constraint is shifting from hypothesis generation, which foundation models can assist with readily, to fabrication and validation, which require physical instrumentation, robotic execution, and closed-loop optimization. Self-driving laboratories (SDLs) aim to address exactly this bottleneck.

An additional important property of autonomous science, across the landscape of these systems for the physical world, is its role as a data engine. Every experiment an SDL runs produces not just a scientific result, but a physically grounded, experimentally validated training signal. A measurement of how a polymer crystallizes under specific conditions enriches the world model’s understanding of material dynamics. A validated synthesis route becomes training data for physical reasoning. A characterized failure teaches the agentic system where its predictions break down. This data produced by an AI scientist conducting a real experiment is qualitatively different from internet-scraped text or simulation output, in that it is structured, causal, and empirically verified. It is the kind of data that physical reasoning models need most and can get from no other source. Autonomous science is the domain that directly converts physical reality into the structured knowledge that improves the entire ecosystem of physical world AI.

New Interfaces

Robotics extends AI into physical action, and autonomous science extends it into physical investigation. New interfaces extend it into the direct coupling of artificial intelligence with human perception, sensory experience, and the body’s own signals, through devices that range from AR glasses and EMG wristbands to implantable brain-computer interfaces. What unifies this category is not a single technology but a shared function of expanding the bandwidth and modality of the channel between human intelligence and AI systems, and in the process generating data about human-world interaction that is directly useful for building physical world AI.

The distance from the incumbent paradigm is the source of both the challenge and the potential in this domain. Language models know about these modalities conceptually, but they are not native to the movement patterns of silent speech, the geometry of olfactory receptor binding, or the temporal dynamics of EMG signals. The representations that decode these signals must be learned from the expanding sensory manifold. There is no internet-scale pretraining corpus for many of these modalities, and the data often must come from the interfaces themselves, which means the systems and their training data co-evolve in a way that has no analogue in language AI.

The near-term expression of this domain is the rapid emergence of AI wearables as a consumer product category. AR glasses are perhaps the most visible instance of this category, along with other wearable consumer devices that take a voice or vision-first input modality.

This ecosystem of consumer devices creates both new hardware platforms for AI to extend into the physical world and new infrastructure for physical world data. A person wearing AI glasses can produce a continuous first-person video stream of how humans navigate physical environments, manipulate objects, and interact with the world. Other wearables capture continuous biometric and movement data. Taken together, the installed base of AI wearables is becoming a distributed data-collection network for physical-world AI, instrumenting human physical experience at a scale that was previously impossible. Consider the scale of the smartphone as a consumer device: the proliferation of a new type of consumer device that enables new ways for computers to sense the world at that scale also creates a massive new channel for AI to interact with the physical world.

Brain-computer interfaces represent the deeper frontier. Neuralink has implanted multiple patients and is iterating on its surgical robotics and decoder software. Synchron’s endovascular Stentrode has been used to give paralyzed users control over digital and physical environments. Echo Neurotechnologies is developing a BCI system for speech restoration that builds on their work in high-resolution cortical speech decoding. Moreover, new companies like Nudge have been formed to aggregate talent and capital to build new neural interfaces and platforms for interacting with the brain. The technical milestones in the research sphere are also noteworthy. The BISC chip demonstrated wireless neural recording at a density of 65,536 electrodes on a single chip, and the BrainGate team decoded inner speech directly from motor cortex.

The through-line connecting AR glasses, AI wearables, silent speech devices, and implantable BCIs is not just that they are all interfaces. It is that they collectively constitute a spectrum of increasingly high-bandwidth channels between human physical experience and AI systems, and every point on that spectrum strengthens the primitives underlying all three domains in this essay. A robot trained on high-quality egocentric video from millions of AI glasses wearers learns different manipulation priors than one trained on curated teleoperation datasets; a laboratory AI that responds to subvocal commands operates with a different latency and fluidity than one controlled by a keyboard; a neural decoder trained on high-density BCI data produces representations of motor planning that are inaccessible through any other channel.

New interfaces are a mechanism by which the sensory manifold itself grows by opening data channels between the physical world and AI that did not previously exist. And the fact that this expansion is being driven by consumer device companies that seek to deploy products at scale means the data flywheel will accelerate with consumer adoption.

Systems for the Physical World

The reason to view robotics, autonomous science, and new interfaces as different instances of frontier systems combining the same primitives is that they are mutually enabling in ways that compound.

Robotics enables autonomous science. Self-driving laboratories are, at their core, robotic systems. The manipulation capabilities developed for general-purpose robotics (dexterous grasping, liquid handling, precise positioning, multi-step task execution) are directly transferable to laboratory automation. As robotics models improve in generality and robustness, the range of experimental protocols that SDLs can execute autonomously expands. Every advance in robot learning lowers the cost and raises the throughput of autonomous experimentation.

Autonomous science enables robotics. The scientific data generated by self-driving labs (validated physical measurements, causal experimental results, materials property databases) can provide the structured, grounded training data that world models and physical reasoning engines need to improve. Moreover, the materials and devices that next-generation robots require (e.g. better actuators, more sensitive tactile sensors, higher-density batteries) are themselves products of materials science. Autonomous discovery platforms that accelerate materials innovation can directly improve the hardware substrate on which robot learning operates.

New interfaces enable robotics. AR devices are a scalable way of gathering data on how humans perceive and interact with the physical environment. Neural interfaces generate data about human motor intent, cognitive planning, and sensory processing. These data are invaluable for training robot learning systems, particularly for tasks that involve human-robot collaboration or teleoperation.

There is a deeper point here about the nature of frontier AI progress itself. The language/code paradigm has achieved extraordinary results and continues to show strong improvement in the scaling era. But the physical world offers an almost unbounded supply of novel problems, data types, feedback signals, and evaluation criteria. By grounding AI systems in physical reality (through robots that manipulate objects, laboratories that synthesize materials, and interfaces that connect to the biological and physical world) we open new scaling axes that are complementary to, and likely mutually improving with, the existing digital frontier.

The emergent behaviors we should expect from these systems are difficult to predict precisely, because emergence by definition arises from the interaction of capabilities that are individually well-understood but collectively novel. But the historical pattern is encouraging. When AI systems gain access to new modalities of interaction with the world — when they can see (computer vision), when they can speak (speech recognition), when they can read and write (language models) — the resulting capabilities are qualitatively larger than the sum of the constituent improvements. The transition to physical-world systems represents the next such phase transition. The primitives discussed here are being built now, and they could enable frontier AI systems to perceive, reason about, and interact with the physical world, unlocking significant value and progress in the physical realm.
