I enjoyed the 2025 roundup pieces from Karpathy, Simon and many others, and they have me thinking about 2026. The AI Apps ecosystem is maturing in some expected ways and some surprising ones. We’ve figured out how to make code cheap, but cheap code hasn’t yet diffused across the enterprise (or the world) in the way the lower costs imply, and I don’t think we’ve realized even 10% of what that means for how companies get built and what software will exist. Meanwhile, there are still fundamental tooling problems to solve, like the fact that all our tools are for making, not for thinking.

Thinking tools vs Making tools

One big change I expect is in the nature of tools themselves. All of the tools we use for knowledge work are focused on execution: IDEs for creating code, Figma for creating design, spreadsheets for creating models. When it comes to tools for exploration – tools that help us think – we don’t really have any modern products beyond the LLMs themselves, which have emerged as thinking partners.

As coding agents work with increasing accuracy and over longer time horizons, the hard problem moves from “how do I build it” to “what do I build.” You can imagine a near-future PM who sets broad goals for their AI and wakes up every morning to review 2-3 features the model dreamt up, executed on, and A/B tested overnight. However, in my experience the models are still not very good at deciding what to build next – the ideas are bland, derivative, and generally lack the spark you see from really good new product thinking. So I think the spiritual successors of coding tools, design tools, and productivity tools will be focused on exploration rather than execution. Coding tools are already leading the way here; Cursor is the furthest along, and I thought Antigravity was interesting in being “agent first” (exploration first) in its product design.

Software eats all the “service” functions in the organization

I’ve always noticed a distinction between “power” functions and “service” functions in software companies – power functions (engineering/product/performance marketing) tend to be closer to software, while service functions (legal/finance/HR) tend to be further from software and more human capital levered.

Coding agents have two important implications for the enterprise. The first is that every team and every task (marketing, legal, procurement, finance) should be software first, and all of these leaders are going to have to learn to reach for a software toolbox before the process and human systems they’ve traditionally relied upon. Many of these organizations will embrace domain-specific products like Harvey, while others will use “bare metal” coding agents like Codex or Claude Code. Every team should be a software team.

The second is that an enterprise (particularly one that produces software) can be dramatically more ambitious about what software it produces, and the entire ideation and prioritization pipeline is going to have to be rebooted to accommodate this. Every feature that can be built will be built, and most enterprises simply aren’t ready for this reality.

I think the culture change problem will be as hard as the organizational change problem.

Compounding AI apps

As we enter year two of reasoning models, I expect to see continued divergence between AI native apps and AI models, with Apps combining the orchestration of cutting-edge models, domain-specific UI, and the very extensive feature surface that is now very cheap to build. This is the natural implication of what we called “Narrow Startups” earlier this year. Extraordinary specialization is now possible, and I think this is part of the strong case for Apps as distinct and increasingly divergent from models.

It feels like the labs and big tech are about as “jagged” in their capabilities as the models they produce. They are formidable in their areas of focus but also have complex commitments (e.g. Google’s commitments to regulators not to further intermediate the internet) and hard prioritization problems (OpenAI is simultaneously competing to be the leading consumer company, enterprise company, model company, and hardware company). So I think it is a bad assumption that the apps layer will be subsumed by models – even in domains like coding, which are central to model progress and lab focus, we see a thriving ecosystem of startups with more than $1B of new revenue generated in 2025 alone.

We previously outlined a framework for areas that advantage AI apps – namely domains that benefit from being multi-model, cornered data resources, network products, and ecosystems with a lot of feature surface. If we combine this with Karpathy’s excellent articulation of “thick” AI apps (multi-model orchestration, autonomy slider, context engineering, etc.), you can start to see what AI apps look like as they mature.

Humans discover “the rest” of AI

Eugenia has been the best thinker on how the command line UI has held back everyday consumers from some of the best capabilities of AI. This is beginning to change: Wabi has been a big catalyst in exposing code generation to consumers, the Images tab in ChatGPT/Grok has done the same for image gen, and with a little luck Apps Directory and Skills will do the same for MCPs and prompt plugins.

I liked Dan Wang’s critique of how Silicon Valley may be a little culturally tone deaf to the impact of AI and I think that getting more consumers making stuff partially alleviates this. Generating a tiny app in 2025 was as delightful as generating a poem in 2023 but most consumers still don’t know this exists. I also think this partially subverts Nikita’s note on who creates stuff, which is a real black pill.

Notes for (incumbent) CEOs

While we are appropriately focused on builders, I have a few thoughts for CEOs who are already at scale and thinking about how to navigate the AI transition. One is to look at best-in-class examples of how models collapse all customer-facing roles (sales, support, collections) into a single function with a broad goal. The second is to embrace the note above about being software first in every function – non-technical functions embracing models is how the enterprise gets broad operating leverage. Finally, I think a lot about demanding more ambitious products and more ambitious prices – if Tesla can deliver FSD coast to coast and Claude Code can be written with Claude Code, then we already have AGI for the near-term purposes of most enterprise tasks.

Finally … have fun

No one tells you that you are living in the good old days until they are gone, so consider this your notice. This product cycle is less centralized, more software-led, and simply more damn fun for technologists than any in recent memory. I hope everyone is having as much fun as I am exploring these new technologies, discussing their implications, and simply making more new stuff.