a16z Podcast

Tracking the Trends: AI, WebRTC, Crypto, and Full Stack Startups

Elad Gil, Sep Kamvar, and Chris Dixon

Posted October 11, 2020

Today’s episode is a conversation about four big trends in the tech world. Any one of these trends would be notable on its own, but we cover all four in this hallway-style chat, as a16z General Partner Chris Dixon talks with Sep Kamvar (professor of Media Arts and Sciences at MIT and now cofounder of cryptocurrency platform Celo); and Elad Gil (investor and the cofounder of health technology company Color Genomics, and formerly at Twitter and Google).

This is a wide-ranging survey of some of the major shifts in technology right now, but it’s really a meta-story of how innovation happens, which is most definitely not in a straight line.

So here are the trends they cover:

* crypto (of course);
* AI and machine learning (including GPT-3 – you can also listen to our explainer episode on what’s hype/what’s real there on our show 16 Minutes);
* full stack startups (which Chris first wrote about in 2014);
* and collaborative web/collaborative enterprise/social (including RTC or real-time communication within the browser), which is where the conversation begins.

Show Notes

WebRTC and WebGL and the move toward collaboration [1:23]

The rise of crypto [5:30] and how new technologies improve over time [8:54]

The current state of AI, with a special focus on GPT-3 [13:57] and how more advanced AI technology may be able to replicate itself [17:25]

Discussion of full-stack startups [21:33] and why there aren’t more of them in the market [25:28]


Collaborative web and the collaborative enterprise; webRTC and webGL

Chris: Elad, you and I have been talking about this. And I know you’re very excited about it, this kind of this feeling that there’s a new stack of web infrastructure, things like video and audio, collaborative video and audio, rather, we sort of have the infrastructure now that it works in a way that it hadn’t in the past. And that’s unlocking a whole new wave of interesting applications.

Elad: People are always looking for the next platform and what the next big platform shift is. And I think it kind of may have snuck up on all of us in the form of WebRTC and WebGL and then related API companies providing sound or other things that are then built on top of by many other companies. And I think this shift is manifesting itself in two different ways and I almost call it the collaborative web and then separately, the collaborative enterprise.

And if you look back 10 years, people kept talking about during the first social wave, everybody kept talking about how there’s going to be a social enterprise and how every SaaS product was going to be more social and collaborative and that largely failed. And it feels like that shift is finally happening in part due to things like WebGL. You see Figma, for example, is the first really strong example of a WebGL-enabled application, allowing you to collaborate in real-time with other people.

In parallel, WebRTC is really allowing for really interesting concurrent sessions around video. And so, you’re starting to see that in terms of a lot of products being built around virtual office rooms, virtual conference rooms. And I really do think this is the moment where collaboration is finally being built into the enterprise world and enterprise products. And then in parallel, WebGL and WebRTC really seem to be enabling really interesting social experiments right now in terms of new social products.

You have really amazing video and audio quality. So the time lag is gone so you can do things like Clubhouse. We see lots of interesting video experimentation. So you can see almost like degraded forms of VR or other things happening in browser. So, I just think now is a really exciting time of innovation around this new web stack.

Chris: And to your point about sneaking up on us, we’ve obviously had, you know, the ability to have conference calls, group audio for, you know, decades, right? Like, the fact that they’re (now) so low latency and you’ve got like the visual representation of the room means… To me, it’s like, if you remember the old days of conference calls, how you’d always have people talking over each other, partly because of, whatever, a 300-millisecond delay?

It’s remarkable how the conversation switches from person to person. It’s the latency. I mean, we’ve all now experienced this with Zoom, right? Like the fact that it doesn’t stutter, the fact that, you know…or very rarely does, like, it’s somehow kind of crossed over this point of good enough.

Elad: We’re finally hitting the point now where in terms of video quality and the ability to stream concurrently across multiple users and in terms of audio quality, we’re hitting that point where the web infrastructure is really supporting the ability to have extremely low latency.

Chris: When you call it a new platform…we’ve cited a few examples. But when you say platform, that means you think there will be thousands of examples or do you think it’s gonna be a whole new wave that goes 5 to 10 years?

Elad: I think like any “platform,” there are going to be a handful of things that really matter that will really be the important things on it. And then a lot of things will be experiments that fail or don’t work. And I don’t know 10 years from now what’s going to be the main set of applications. I just think it is a shift that enables a bunch of new applications to be built, particularly either social or collaborative enterprise.

One example that I think is worth noting in terms of what’s coming due to WebRTC is it’s quite possible that if you look at virtual reality or VR, the predominant use case in the near term may actually shift to the browser. And so, I think right now, in order to experience VR, you need a headset. You need, in some cases, client software, etc.

And so, there’s more obstacles and hurdles to be able to just participate. And I think one of the things I found really interesting about WebRTC and WebGL is the ability to suddenly create VR-like experiences where you just drop any URL and you can show up. And so, the big question in my mind is Oculus almost like the desktop computer versus mobile devices, where the desktop really helps you do powerful tasks but you can do a lot on your phone and it’s sort of the mainstream use case for most of the internet today. So, I think that’s another thing that we’ll see if it happens or doesn’t happen over the next decade. But that may be one interesting long-term trend to watch relative to WebRTC and WebGL.

Crypto, and patterns of innovation adoption

Chris: So, let’s talk about the next trend, crypto. We’re all involved in this. Elad, you invest in crypto. Sep, you’ve co-founded a company, Celo, in crypto. Obviously, I spend most of my time investing in crypto. So, can you tell us a little about why you’re excited about it and the stuff you’re working on in Celo?

Sep: I’ll start off with kind of a general principle that I think is true for all of the technologies that we’re talking about. There is a certain class of technologies that increase the expressive range of a certain medium. And when you increase the expressive range of a medium, a lot of things pop up that were not possible before because you now are playing in a new design space.

The historical example that I always love to point to is in the 1800s, the invention of the metal ferrule in painting — the little piece between the paintbrush and the paintbrush handle — and the collapsible easel. Those two things together allowed people to A, bring their paintings outside, and B, start to paint with a new brushstroke that allowed them to quickly dab paint onto the canvas. And those two ended up kind of giving rise to a form of painting that we now know as Impressionism.

And so, it’s interesting to think about that. Impressionism was a result of technological advances in painting. And you see that same thing with the web and the internet in general. There were technological advances in the medium of text. And so, all of a sudden, people could send text more quickly. Anybody could be a broadcaster. You could start putting text together with code to create different things. And that vastly increased the expressive range of text in a way that led to all of these things that you could not predict in advance.

So, for example, in ’94 and ’95 when the web was starting to become popular, one could not imagine that, “Oh, well, one day I’ll be able to press a button and order my groceries on this and have my groceries come to me,” you know? And so, I think those are really interesting from a broad-brush technological point of view.

Why I’m excited about crypto is that crypto does this for money. It increases the expressive range of the technology that we know as money. And that I think will follow very similar to the internet. You know, at the beginning of the internet, you saw it allowed people to pass messages more quickly to one another across a distance in a way that was just qualitatively different than fax.

And that is like the first thing that you started seeing with crypto and it has direct implications to things like remittances or banking the unbanked. But then on top of that, the second implication of the web was that anybody could become a broadcaster. I mean, with YouTube, anybody could have their own TV station. And in the context of crypto, you have the same democratization but in financial services. And so you see this kind of rise in decentralized finance or open finance.

And then third, most exciting, is it allows money to become programmable in the same way that the internet allowed text to become programmable. And that, I think…I mean, we’re seeing some early things today. But that’s, I think, the aspect that we’re still the earliest and it has the most legs. And it is the most powerful and the most difficult to predict at this stage since we’re in such an early phase.

Chris: My framework for this is when there’s a really big breakthrough technology, there’s two stages. And the first stage, you do things you already did but do them better. And the second stage, you do new things you never could do before. And this goes back to the collaborative web stuff we were talking about before. Like, in the first stage, you know, we’re gonna do better video conferencing, right, better audio conferencing and that will probably be a wave that lasts a few years.

And then at some point, people will start to figure out this is a whole new set of things we’ve never done before. Like, the analogy on the web, right, is the first era in the ’90s, people were just kind of putting websites up. They were basically one way. They were brochures and magazines. But then it took another decade to realize there’s things you can just do that you could never do before like social networking, right? It’s multi-way medium, not a one-way medium, right?

It’s similar to my understanding of the history of film. When film started off, you know, they filmed plays, right? And then they realized you could do all these new kind of film-native things, right? And I think crypto will be the same thing. And you hear…the mistake people make is they say, “Oh, great, you can lower payment fees. You can send cross-border payments.” And all of that is true but that’s only phase one, right? Phase two is things we can’t think of, we can’t even imagine.

It’s funny. If you go back and you look at all the ads for mobile phones, like, for 10 years, Nokia and all these folks, they were all trying to convince people to use mobile phones and it was always stocks, weather, email. There’s literally I think no person in the history of that field that predicted, you know, half the things that we’re using today. So, for me, I think that framework kind of applies whenever there’s a really big breakthrough technology. It just takes a long time to really explore the new design space it opens up.

Sep: And, you know, I think one of the reasons for that is a lot of times, the things that are new arise from the things that are old just at scale, at quantity, you know? And that’s actually really interesting because it helps give a framework for predicting things. So you could imagine, for example, blogs were predictable from zines before the internet, you know? But it would be qualitatively different because then you imagine what happens if there’s like thousands and thousands of zines and anybody could access those zines and so on?

And so, then, that kind of starts the creative process going.

Chris: And then, I’ve been directly involved in this, in the infrastructure stuff, people were working on it but it was frankly a little academic until recently. And so, the fact that the applications have taken off so much has made this scaling problem a really, really urgent issue. I think it will dramatically accelerate the pace of innovation on the infrastructure side, right? It’s no longer academic, it’s now a very practical problem and there’s real customers and people willing to pay money.

And, you know, the same feedback with…you’ve seen, I think, throughout the history of computing where the app developers on the first iPhone start pushing it to the limit. And that pushes Apple to, you know, go faster and the chip guys to go faster and the whole thing. And then you get that beautiful flywheel that drives everything forward.

Sep: And this is something that’s been very much on our minds as we’ve been developing Celo. So, basically, kind of when we started Celo, the conversation that we were having was, the blockchain reminded us that money is just a technology. And, of course, money has always been a technology. It’s just hard to remember that it’s a technology because its features haven’t changed very much for the past 300 years. But as a technology, its features can change and as a widely used technology, its features have an impact on the society that uses them.

So, I remember when the internet was first getting popular, people were like, “Whoa, you could imagine putting the whole encyclopedia on the internet.” And that was true, but it underestimated the true potentiality of the internet, which was that the encyclopedia would be part of a much richer, much bigger information ecology.

And so, similarly, I see the same thing happening in money, in value. National currencies will continue to exist and continue to be important, but there will also be local currencies, regional currencies, global reference currencies, store-of-value currencies, medium-of-exchange currencies, functional currencies, all interoperating with one another in a rich ecology, not dissimilar to the internet.

We now are starting to have the technology to implement these ideas at scale. But to do a number of these things right, we needed some form of stabilization of cryptocurrency. We needed some methods around identity. We needed advances in light client, and so on. And so, that helped guide the infrastructure that we’re building to enable this.

Chris: It’s going to be an exciting year in crypto. Celo has launched and is continuing to roll things out and a whole bunch of other exciting crypto projects. And so, sort of all of the things that were kind of hatched back in 2017, in ’16, ’17, ’18, are kind of finally all coming out now and it should be really exciting.

Elad: It just seems like that next wave is starting up again too in terms of incrementally new things. Like YFI (Yearn Finance) I feel like just came out of nowhere, for me at least. And so I think that we’re going to see renewed enthusiasm, I think, in crypto in the reasonable near term.

AI, GPT-3, and “hyper-evolutionary” new forms

Chris: Let’s talk a little about AI, sort of the other…it’s amazing right now. I feel like any one of these things would be a major tech trend and we have all of them going on at the same time. So, AI, I don’t personally work on it as a day job but follow it, I guess, as a hobbyist. The big news being GPT-3, which is an algorithm out of OpenAI, which has just shown kind of remarkable results with natural language processing.

And from what we can tell, this is not going to be slowing down. Today the more computers you throw at these kind of neural networks, the smarter they get. And at least at the moment, these systems continue to scale at a pretty healthy rate. So we should see kind of more and more really interesting stuff. Elad, I think you’ve followed this area pretty closely. How are you feeling about it?

Elad: I think GPT-3 is almost like the starting shot for a whole new interesting era in natural language processing or natural language understanding that’s going to take a decade to play out. And I think the historical antecedents or analogs are, back in 2012, there was something known as AlexNet from this guy, Alex Krizhevsky, which was really the starting shot for machine vision, in that it was the first time in a while where you really saw a big step up in performance, and that’s really led to everything from face recognition on the iPhone to machine vision in pharma.

Similarly, in 2013, Google switched to recurrent neural networks for speech recognition and then later really did a lot of interesting things in deep reinforcement learning. And that ended up becoming a multi-year precursor to what became things like Amazon Alexa or Echo or a lot of the really good speech recognition technologies we have. And now in 2020, I think similarly, GPT-3 is a natural language analog to these two other key moments in machine learning-based understanding of vision, speech, and now natural language.

I actually think this may be one of the biggest shifts because if you think of how much of the world’s information is embedded in text or how much we communicate in text, this is really the big revolution. And that includes things like enterprise document processing. If you move to natural language, you can start thinking about smart data entry. All the robotic process automation suddenly becomes automated. You can effectively have APIs, in some sense, almost self-construct on top of text in really interesting ways.

There’s things that are very tactical. For example, in your email inbox, all the replies should be auto-generated and then you should just be able to go through and approve them as a person. We’re not there yet, again. It’s a 10-year journey. But, you know, we’ll see things like that. We’ll see legal documents just auto-marked up relative to what your company would normally do. Companies like Klarity are working on early versions of that. If you’re an author and have writer’s block, maybe automatically, you get prompted for three or four different next paragraphs to kick off how you should think about it.

Or in the long run, maybe there’s a whole class of auto fan fiction. So, you know, you really love the novel “Twilight,” and 100 different versions of “Twilight” are spawned. So you don’t have to wait for somebody to come up with “Fifty Shades of Grey.” It just auto-generates, you know, multiple different interesting, you know, fanfic stuff.

On the gaming side, I think you’ll have non-player characters, NPCs, that seem like real people. In health care, maybe you have a mental health specialist who’s really just a robot.

I think this is a really exciting shift and it’s going to take a long time to play out but the technology is finally starting to show hints. Just like in 2012, AlexNet showed hints of what could happen in machine vision. And in 2013, Google showed what could start happening in speech recognition. It feels like this is one of those steps. And so I think it’s significant in terms of a starting shot, although, I think it’s going to take a lot of time to play out.

Sep: I’m really excited about the translation opportunities, in particular the opportunities to translate English to machine understandable code.

Chris: They’ve actually had demos of this with GPT-3, right? Where you describe something and it would actually write the code for you. I haven’t personally tried it but it seems like they’re not canned demos. It really does kind of work.

Sep: And, you know, it’s really straightforward to do that in the context of data structures. You could imagine translating a sentence into a data structure. And it’s not a far step from doing that to natural language querying of SQL. And then it’s not a far step from that to auto-generating code. And so, that’s super exciting to me because you can imagine, there are certain things that are straightforward to build if you know how to program. And they should be straightforward to build if you don’t but it takes kind of ad hoc interfacing to do. Creating a new ERC-20 token, for example, is a pretty straightforward programming task that I can see that someone could use machine translation using any of these technologies, but GPT-3 in particular, to start translating human text to machine text.
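As a toy illustration of the chain Sep describes (sentence, then data structure, then SQL query), here is a minimal Python sketch. It is not how GPT-3 works: a hand-written pattern stands in for the learned model, and the sentence template, the `tokens` table, the sample rows, and the function names are all hypothetical, chosen only to make the intermediate steps visible.

```python
import re
import sqlite3

# Hypothetical sentence template: "show <column> of <table> where <field> is <value>".
# A language model would learn this mapping; a regex stands in for it here.
PATTERN = re.compile(
    r"show (?P<column>\w+) of (?P<table>\w+) where (?P<field>\w+) is (?P<value>\w+)",
    re.IGNORECASE,
)


def parse_sentence(sentence):
    """Step 1: translate a sentence into a plain data structure (a dict)."""
    match = PATTERN.fullmatch(sentence.strip())
    if match is None:
        raise ValueError(f"unsupported sentence: {sentence!r}")
    return match.groupdict()


def to_sql(query):
    """Step 2: turn the data structure into a parameterized SQL query."""
    # Identifiers were matched with \w+, so they hold only word characters;
    # the value is passed separately as a bound parameter.
    sql = (
        f'SELECT "{query["column"]}" FROM "{query["table"]}" '
        f'WHERE "{query["field"]}" = ?'
    )
    return sql, (query["value"],)


def run_demo():
    """Step 3: run the generated query against made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (name TEXT, symbol TEXT)")
    conn.executemany(
        "INSERT INTO tokens VALUES (?, ?)",
        [("Alpha", "ALP"), ("Beta", "BET")],  # invented rows for illustration
    )
    sql, params = to_sql(parse_sentence("show name of tokens where symbol is BET"))
    return conn.execute(sql, params).fetchall()
```

Calling `run_demo()` parses the sentence, builds the query, and returns the matching rows. Going from here to generated code, such as a token contract, would replace the fixed pattern with a learned translation step, which is the part this sketch leaves out.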

Elad: To your point, I think the second that machines can really write and edit code and can spawn instances of themselves and self-replicate, at that point, I think we’re really shifting from a technology to a life form. And I think at that point, you know, we really have this hyper-evolutionary new form of life that’s self-replicating, self-editing. And, you know, one of the interesting things is people always think that a true AGI or self-intelligent agent will come out of Google or Facebook or one of the major companies.

To put these threads together, one could argue maybe where it’s going to really emerge is on the blockchain where you have these really interesting human incentives and competition around something of real value. So you have sort of an optimization metric that’s very crisp when you’re competing to effectively complete financial transactions or contracts and they’re going to get more and more complicated. And so, I think the merger of these two areas will someday happen and it’s going to be fascinating to watch in terms of whether you have this sort of emergent system of self-replicating, self-editing code with strong financial incentives built into it.

If you look at the biology side of things, that replication plus mutability plus selection is really what drove the emergence of intelligence, right? And so, really, the selective function is you need to have a large enough number of different beings or entities. You need them to be able to change at some rate so that they start adapting to their environment they’re being selected for, and then you need that selective pressure.

And when you start having machines be able to edit themselves and to write themselves and replicate themselves at scale, you’re both expanding the number of potential entities that are evolving. But you’re also upping the clock rate. You’re not waiting for a person to write something and test it and then iterate on it and then test it again and try and understand it and theorize and then write more code. You just have systems that are replicating and changing themselves.

And imagine if as a human, you could edit your own DNA and change certain features and experiment with that very rapidly. That’s what’s going to happen in the world of code. And so, I think it’s a long time away, but once code can write itself, I think that’s really when things kick-off for the emergence of a true AGI-based life form.
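The replication, mutability, and selection loop Elad describes can be sketched in a few lines of Python. This is only a toy under stated assumptions: the "entities" are short strings rather than programs, and the target word, alphabet, population size, and mutation rate are hypothetical stand-ins for an environment's selective pressure. Nothing here edits its own code.

```python
import random

TARGET = "money"  # hypothetical stand-in for what the environment rewards
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def fitness(candidate):
    """Selection metric: number of positions that match the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate, rng, rate=0.1):
    """Mutability: each character is replaced with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in candidate
    )


def evolve(seed=0, population_size=50, max_generations=2000):
    """Run replicate/mutate/select until the target appears or time runs out."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(population_size)
    ]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            return population[0], generation
        # Selection: only the fittest fifth get to reproduce.
        survivors = population[: population_size // 5]
        # Replication with mutation; the current best is copied unchanged
        # so the top fitness never goes backwards.
        population = [survivors[0]] + [
            mutate(rng.choice(survivors), rng) for _ in range(population_size - 1)
        ]
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations
```

`evolve()` returns the best string found and the generation at which the loop stopped. Raising `population_size` or shortening each generation is the toy analog of having more entities and a faster clock rate.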

Chris: There’s no reason this couldn’t be applied to any kind of symbolic system. So for a mathematician, you know, the computer suggests five different proofs. For scientists, the computer suggests five different theories or interpretations or models or whatever it might be. And maybe in the near term, it works alongside a human. Maybe at some point, the machine gets so good, it doesn’t need that. And it’s probably, going back to the framework I was suggesting earlier, the stuff we’re describing falls in the category of doing existing things better. There will probably be crazy new things that we can’t even imagine right now but some developer or entrepreneur will come up with.

Elad: The analog I’ve heard or the analogy I’ve heard for GPT-3 is it’s kind of the clever student who didn’t really study for the exam and half the time kind of bullshits it and half the time knows it. And to your point, GPT-3 can write the next paragraph. The question is, what does GPT-20 look like? GPT-50? As we iterate on these systems, you suddenly have the thing that can really write the fan fiction novel for “Twilight.” So it’ll be really fascinating to watch.

Full-stack startups

Chris: So the last thing we’re going to talk about is what some people call full-stack startups, which is sort of a new way to build startups. Sep, it’s a concept you were interested in. But essentially, the idea is whereas in the old days, software startups mostly stuck to just building software, more and more entrepreneurs are building companies that are sort of software-enabled but also build core capabilities in other areas.

So, just as an example, in fintech, it used to be that the only way you would go to market, you’d build software and try to sell it to a bank or an existing financial institution. Now, more and more, you have these things, like Chime is an example, of an online bank that just sort of bypasses Citibank and goes directly to consumers. It’s an app. You can download it.

Robinhood is another good example. Instead of building software and selling it to Schwab, they just built software and built an app and went directly to the public, right? And this is happening as sort of a new design pattern for startup organizations that’s, I think, letting startups penetrate more and more deeply into industries that had previously kind of resisted software innovation. Sep, I know it’s a topic you’re interested in.

Sep: For the audience, Chris wrote a blog post called “Full-Stack Startups” back in 2014, which is, I’d say, a must-read. It’s one of the most concise and articulate descriptions of this phenomenon that I’ve ever read. I think, basically, kind of full-stack startups were later to emerge than pure software startups for a variety of reasons. Mostly because there was a fair amount of low-hanging fruit in software itself and it is harder to do a full-stack startup because you basically have to start two companies at the same time.

I mean, if you’re starting a full-stack construction company, you have to start a construction company and a software company at the same time. And it’s hard enough just to start either. And full disclosure, I am a co-founder of Mosaic, which is a full-stack construction company. So I’m biased here. But once you’re able to do that, if you’re able to do that, then it allows something really powerful, which is it allows you to write software not just for existing processes, but it allows you to innovate on process at the same time as you innovate on software. And very specifically, it allows you to innovate on process in the way that software enables.

And so, in the same way as for crypto, software increases the expressive range of a whole range of things. And that expressive range allows new processes for things like building houses or selling eyeglasses or so on. And it’s really helpful to be able to have a really tight loop between changing the process itself, which is not inherently software-based but new software allows you to do that, and then iterating on the software itself. And so, that kind of opens up an area of innovation that is really difficult to do with either side of the stack alone.

Elad: It seems like there’s a lot of other places where that approach that you mentioned, Sep, really applies. I mean, a company I co-founded that, you know, I haven’t really been operationally involved with for many years is Color Genomics and it’s doing a large proportion of COVID testing in a number of different markets. And a lot of the value, I think, of what the company does on top of just running a vertically integrated lab and other software around it, is all the virtualized care delivery and all the patient interactions, doctor interactions, etc., beyond just, “Hey, can you run a better lab?”

And so I think, to your point, that vertical integration has made a huge difference for Color as an example. And similarly in real estate, not just Mosaic, but it seems like Opendoor, which is literally going in and repainting the interiors of houses as it buys them and things like that and is layering on mortgage and title and everything else of the home purchasing process. It seems like there’s just an enormous amount of innovation in terms of the ability to build something that’s full-stack.

Sep: You know, I mean, it’s interesting. I’ve had a similar question, a little more general, Chris, from the time you wrote that blog post, which was… “Why is there not a flowering of full-stack companies in the same way that there’s a flowering whenever there’s a clear possibility of innovation?” You’re not seeing as many full-stack startups as you are seeing with crypto, for example. I’ve come to two reasons. I imagine there’s more.

The first is that it is really difficult in either context, either in the startup context because it requires kind of an expertise in two very disparate areas as a startup. And I think the second thing is I think it’s tantalizing to take a big industry and, say, dabble on some technology and it will become a full-stack company. But I think you have to have a specific point of view around what the technology is and a real innovation in that technology. And so, I think for those two reasons, I think it’s just like the rewards are great but the difficulty is hard.

Chris: Netflix is a really interesting example, right? So Reed Hastings, obviously a genius, but his prior company to Netflix was Pure Software, which made Purify, which is a debugger. I mean, it was a very, very technical product. He’s very much a computer scientist, you know? And then, he built…you know, now Netflix is doing all this original content, is becoming more and more dominant in the movie industry. I don’t know…you know, this pattern of having a technologist figure out the other industry, in that case, Hollywood, like, I don’t know if there are examples of the opposite happening of the Hollywood people figuring out the technologies part, you know?

Sep: Yeah. And there’s few examples of Netflix too.

Chris: Sure. And it’s so hard. And that company… I mean, that’s an amazing story, just independently. They had to pivot multiple times while being public. So, you know, and there’s just so few people…

Sep: Right. And he’s a remarkable entrepreneur.

Chris: Yeah. Maybe there are just so few people like him and Elon Musk and it’s such a hard thing to do. And, you know, raising…it requires a ton of capital and decades of work. So, I think that might be why there’s not more. It’s just really hard.

Sep: For us, the way we were able to do it is Salman, my co-founder, he has a Ph.D. in computer science from MIT but he also grew up in a construction family. So he had deep expertise in construction and deep expertise in computer science both from a young age.

Elad: I think it’s kind of notable or interesting that most of the examples I can think of of really successful full-stack startups are second-time founders. So, with Mosaic, you know, your co-founder, obviously, had an amazing background in terms of family construction and everything else. But the flip of it is you, Sep, had already started, you know, companies before. Elon Musk had two successful outcomes before. Reed Hastings had a successful outcome before. So, it almost feels like you need a stable financial base plus enough know-how in terms of building a company to begin with so that you can take on this extra challenge of doing a second piece of it, you know, to Sep’s point around needing to build two companies at once.

Chris: That’s a good point. Thanks, Sep and Elad. That was awesome.