Who has the advantage in artificial intelligence — big companies, startups, or academia? Perhaps all three, especially as they work together in fields like this. One thing is clear though: A.I. and deep learning are where it’s at. And that’s why this year’s newly anointed Andreessen Horowitz Distinguished Visiting Professor of Computer Science is Fei-Fei Li [who publishes under Li Fei-Fei], associate professor at Stanford University. To bridge entrepreneurs across academia and industry, we began the a16z Professor-in-Residence program just a couple of years ago (beginning with Vijay Pande and most recently with Dan Boneh).
Li is the Director of the Stanford Vision Lab, which focuses on connecting computer vision and human vision; is the Director of the Stanford Artificial Intelligence Lab (SAIL), which was founded in the early 1960s; and directs the new SAIL-Toyota Center for AI Research, which brings together researchers in visual computing, machine learning, robotics, human-computer interaction, intelligent systems, decision making, natural language processing, dynamic modeling, and design to develop “human-centered artificial intelligence” for intelligent vehicles. Li also co-created ImageNet, which forms the basis of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) that continually demonstrates drastic advances in machine vision accuracy.
So why now for A.I.? Is deep learning “it”… or what comes next? And what happens as A.I. moves from what Li calls its “in vitro phase” to its “in vivo phase”? Beyond ethical considerations — or celebrating only “geekiness” and “nerdiness” — Li argues we need to inject a stronger element of humanistic thinking into how we design and develop algorithms and A.I. that can cohabit with people in social (including crowded) spaces. All this and more on this episode of the a16z Podcast.
Where AI research is today in terms of hardware and algorithms [0:51]
Discussion of creativity and artistic, “generative” intelligence [11:14]
Where startups have opportunity in AI [15:51] and a discussion of self-driving cars, including the ethical issues [19:00]
How AI needs to learn to interact with humans in a socially acceptable way [24:24]
Adding a humanistic element to AI research to attract more diverse young people [29:11]
Sonal: Hi, everyone. Welcome to the “a16z Podcast.” I’m Sonal, and I’m here today with a16z partner, Frank Chen. And we’re interviewing our newest professor in residence. This is actually the third year — the first year we had Vijay Pande, who’s now the general partner on our Bio Fund, and then we had Dan Boneh. And now, we are so pleased to welcome Dr. Fei-Fei Li, who is the director of the Stanford AI Lab, the Stanford Toyota AI Center, and the Stanford Computer Vision Lab — where pretty much the most important work is happening, at least we think.
Frank: AI is the white hot center of both a lot of startup activity, as well as academic research. And, Fei-Fei, why in the world has it gotten so hot again?
Dr. Li: From my perspective, AI has always been hot. AI is a discipline about 60 years old. In the past 60 years, I call that the “in vitro AI time,” where AI was developed in the laboratories and mostly in research centers. We were laying down the mathematical foundations of AI, we were formulating the questions of AI, and we were testing out the prototypes of AI algorithms. But now, going forward, we’re entering what I call the “AI in vivo time,” [in] which AI is entering real life. So why now? What’s triggering the switch between in vitro to in vivo? I think several things are happening.
First is that AI’s techniques have come of age, but what’s driving that? There are two more very important factors. One is the big data contribution to AI. It’s, you know, the information age. The internet age has brought us big data, and it’s now boosted even further by trillions of sensors everywhere. And the third factor that’s contributing [to] this is the hardware — the computing hardware, the advance of the CPUs, of the GPUs, and the computing clusters. So the convergence of, I’d say, mathematical foundations and statistical machine learning tools, the big data, and the hardware has created this historical moment of AI.
Frank: Why don’t we unpack those in turn? Because I think each one of them is an interesting trend in itself. So why don’t we talk about hardware? We have CPUs, we have GPUs. So it turns out deep learning is great to do on GPUs, because it’s linear algebra and parallelizable. Are we going to see deep learning chips?
Dr. Li: I think so, and I hope so.
Sonal: What would deep learning chips look like? Just obviously, much more — the ability to do much more parallelization, but what does that actually look like? Is it like what’s happening with Nvidia’s chips right now, or something different?
Dr. Li: Nvidia is definitely one of the pioneers in deep learning chips, in the sense [that] their GPUs can handle highly parallelizable operations. And as it turns out, much of the internal operation of a deep learning algorithm — which technically we call neural networks or convolutional neural networks — involves a lot of repeated computation that can be done concurrently. So the GPUs have really contributed a lot in speeding up these computations, because they can be done in parallel. GPUs are wonderful for training the deep learning algorithms. But I think there is still a lot of space in rapid testing or inference-time chips, which can be used for recognition, you know, in embedded devices. So, I see there is a trend coming up in deep learning chips.
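To make the parallelism point concrete, here is a minimal numpy sketch — purely illustrative, not any vendor’s actual kernel — of how a convolution layer reduces to one big matrix multiply. Every output-pixel/filter pair is an independent multiply-accumulate, which is exactly the kind of repeated, concurrent arithmetic that GPUs (and, increasingly, dedicated inference chips) are built for.

```python
# Toy illustration: a convolution layer as one big matrix multiply (im2col).
import numpy as np

def conv2d_as_matmul(image, kernels):
    """image: (H, W); kernels: (num_filters, k, k). Valid convolution via im2col."""
    H, W = image.shape
    num_filters, k, _ = kernels.shape
    out_h, out_w = H - k + 1, W - k + 1

    # im2col: unroll every k x k patch of the image into one row of a matrix.
    patches = np.stack([
        image[i:i + k, j:j + k].ravel()
        for i in range(out_h) for j in range(out_w)
    ])                                          # (out_h*out_w, k*k)

    weights = kernels.reshape(num_filters, -1)  # (num_filters, k*k)

    # One dense matrix multiply: every output element is independent of the others,
    # so all of them can be computed at the same time on parallel hardware.
    out = patches @ weights.T                   # (out_h*out_w, num_filters)
    return out.T.reshape(num_filters, out_h, out_w)

image = np.random.rand(8, 8)
kernels = np.random.rand(4, 3, 3)
print(conv2d_as_matmul(image, kernels).shape)   # (4, 6, 6)
```

An inference-only chip can bake exactly this multiply-accumulate pattern into silicon and skip all of the machinery (gradients, weight updates) that training requires.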
Sonal: So more specialized hardware dedicated for that.
Dr. Li: Yeah.
Frank: Yeah. And we’ve already seen startups do it, like Nervana. Obviously, Google announced the Tensor Processing Unit, right? So they’ve got dedicated silicon as well. So…
Sonal: So GPU to TPU, basically?
Frank: Yeah, exactly. Once you know that you’re going to do something over and over again, then you want it in silicon — both for the performance, and then very importantly on the embedded side, for power consumption. Which is — you want your iPhone eventually to be able to do this.
Dr. Li: But I still think, Frank, that this is a little bit — I wouldn’t say it’s too early, but I think we’re still in the exploratory stage, because the algorithms are not mature enough yet. There’s still a lot of exploration about what to do and the best way to do it. So, you know, like — at this year’s ICLR, one of the top deep learning conferences, one of the best papers is a particular piece of work coming out of Stanford. Not my lab actually, somebody else’s lab — Professor Bill Dally’s lab — where they’re exploring a sparse algorithm that can enable a specific design of a chip. So, this conjunction of improving algorithms in order to also design the innovative chip is still happening right now.
Sonal: Is that a new thing?
Dr. Li: You mean like algorithms driving the design of the chips…
Sonal: Right, versus the other way around, sort of the chicken egg thing and what comes first.
Frank: Chip design is already so complicated that you have to do it with algorithms. Humans can’t actually lay out chips.
Sonal: Oh, I don’t mean algorithm, like a design. I thought what you were saying was, designing the chip for a particular type of almost universal algorithm, which is how I heard that thing.
Dr. Li: It is designing the chip for a type of algorithm, but it’s a family of algorithms.
Frank: Your argument is that because we’re not sure what the winning algorithms are going to be, we’re still in this very productive period where we’re trying lots and lots of algorithms. It might be too early to design chips, because to put something in hardware, it’s obviously incredibly expensive to get to an ASIC, right? It’s $50 million to tape out. And so unless you’re sure you know what algorithms are gonna run, you can’t optimize the chips for it. Is that…
Dr. Li: Oh, actually, I think it’s really important [that] this thing is happening right now. This R&D has to happen concurrently. It’s just, like Sonal said, there’s a chicken and egg dynamic here, that algorithms affect the way chips are designed, but the constraints of the chips could in turn affect the algorithm. I think this is [the] time to explore this. This is the time to devote resources. Of course, in terms of business model, one has to be careful.
Frank: So the second thing — or another of the three things that you mentioned was that we’ve laid the mathematical foundations for artificial intelligence. And I want to come back to this idea of, look — the hottest thing right now is deep neural networks. But over the 60 years of AI research, we’ve actually used many, many different techniques, right — logic programming. We’ve used planning algorithms, and we’ve tried to implement planning as search algorithms. And so, is deep learning it? Is this what the community has been waiting for, or is this just, “Okay, it’s hot now but there’s going to be something else later, too?”
Dr. Li: I get this question a lot — is deep learning the answer to it all? So, first of all, I’m very happy you actually brought up other algorithms and tools. So, if you look at AI’s development, in the very early Minsky and McCarthy days, they used a lot of, you know, first-order logic and expert systems. And those are very much driven by cognitive designs of rules. But what really, I think, was the first AI spring phase is the blossoming of machine learning — statistical machine learning algorithms. We’re looking at, you know, boosting algorithms, Bayesian nets, graphical models, support vector machines, regression algorithms, as well as neural networks. So, that whole period — there are about 20, 30 years of blossoming of machine learning algorithms [that] laid the statistical machine learning foundation for today’s AI. And we shouldn’t overlook that.
In fact, many, many industry applications today still use some of the most powerful machine learning algorithms that are not <inaudible>. Deep learning is not the newest. It was actually developed in the ’60s and ’70s by people like Kunihiko Fukushima, then carried forward by Geoff Hinton, Yann LeCun, and their colleagues. I think there [are] some really powerful ingredients in the neural network architecture. It is a very high-capacity model that can approximate almost any function, and it can do end-to-end training that takes data all the way to the task objective and optimizes on that. But is deep learning it? I think there are quite a few questions [that] remain that would challenge today’s deep learning architecture and hopefully challenge the entire thinking of AI going forward. One of the more obvious ones everybody talks about is supervised versus unsupervised training.
Sonal: And this is I think so important, because a drawback of the current narrative is that it focuses so much on the supervised cases — that we don’t have computers that learn the way children learn.
Dr. Li: Exactly. First of all, we don’t even know much [about] how children learn. There’s a vast body of education and developmental psychology literature, and that’s not getting into computer science yet. You know, supervised learning is powerful when data can be annotated, but it gets very, very hairy when we want to apply a more realistic training scenario. For example, suppose one day a company builds a little robot that it sends to your home, and you want the robot to adapt to the tasks that your family wants done. The best way of training is probably not to open the head of the robot and put in all the annotated data. You want to just, you know, show it and talk about what the tasks are, and have the robot observe and learn. That kind of training scenario, we cannot do in deep learning yet.
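As a concrete illustration of the annotation bottleneck Li describes, here is a toy supervised-learning sketch in plain numpy (not her lab’s code): note that every single training example needs a label y supplied up front — which is exactly what you cannot do by just “showing” a home robot what to do.

```python
# Minimal supervised learning sketch: toy logistic regression on labeled data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                 # 200 examples, 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(float)     # the "annotation" a human must supply

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # predicted probability for each example
    grad_w = X.T @ (p - y) / len(y)           # gradient of the cross-entropy loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w                          # every update leans on the labels y
    b -= lr * grad_b

accuracy = np.mean(((X @ w + b) > 0) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Without the label array y, this loop has nothing to optimize against — hence the interest in unsupervised and observational learning for the household-robot scenario above.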
But there’s more than just supervised training versus unsupervised training. There is also this whole question of what it means to be intelligent, right? Task-driven intelligence is really important, especially for industry. You know, tagging pictures, avoiding pedestrians, speech recognition, transcribing speech, carrying goods. Specific task-driven applications are part of AI and [are] important, but there is also AGI — artificial general intelligence — of reasoning, abstraction, communication, emotional interaction, understanding of intention and purpose, formulation of knowledge, understanding of context. All this is still largely unknown in terms of how we can get it done.
Sonal: Where would you put creative AI on that list — okay, there are the problems that are yet to be solved: unsupervised, supervised, generalized intelligence, and now also creative intelligence?
Dr. Li: Actually, you know, here’s one question we should ask ourselves. What is creativity? If you look at the four or five matches of AlphaGo, there were multiple moments when AlphaGo made a move that really surprised Master Lee Sedol. And if you look at the Go community, people were just amazed by the kind of creativity AlphaGo has, in terms of making moves that most people cannot think of. From that point of view, I think we’re already seeing creativity. Part of creativity is just making the right decisions in a somewhat unexpected way. That’s already happening.
Sonal: I meant, actually, that I’m more interested in the type of creativity that defies logic, because that’s an example of logical creativity. I’m thinking of something like Jackson Pollock. There is no way a computer is going to waste paint and splatter it, because it’s the most inefficient, irrational thing to possibly do. That’s the kind of creativity I want to know about. I mean, I’m seeing examples of, like, AI-written short films, AI poetry — in your own lab, there are people who are writing captions for images. That’s, like, maybe still mechanistic — and Kevin Kelly would even argue that creativity in itself is largely mechanistic, and it’s not as human, as anthropomorphic, as we think it is — but I really mean, like, artistic creativity.
Dr. Li: Yeah, that’s a great question. So interestingly, you already see some of the deep learning work of transferring artistic style. You can put in a Van Gogh painting and turn a photo into that, but I agree that’s very mimicking.
Sonal: Mechanistic.
Dr. Li: Mechanistic. The kind of creativity we’re talking about blends our logical thinking, emotional thinking, and just, you know, intuitive thinking — and I haven’t seen any work today that builds on the kind of mathematical formulation that would enable that.
Frank: Yeah, it comes back to one of the three things that you used to set up, “Why is AI winning now?” And that is about data — which is, if you’re just going to feed the system a bunch of data and then have the neural net train itself, can that ever lead to something that’s truly creative, which isn’t in the data itself?
Dr. Li: Right. Exactly. So, this is…
Sonal: Exactly. Or, maybe it could, by the way, because maybe it can follow the same type of logical arc of history, where you go through a classic phase, a traditionalist phase, an impressionist phase, a post-impressionist phase, an abstract phase. And then you actually go through Jackson Pollock — kind of, Modern Art phase. Like, I almost wonder if you could technically train on that type of history of art and see what happens. I know that’s crazy, and this is completely abstract. And it’s not in any way tied to the actual computer science, but just theoretically.
Frank: We already have systems that can paint in all of those styles, because there was enough in the data so that it could form a classifier that said, “Here’s the style of Van Gogh,” or, “Here’s the style of an impressionist,” and then we can mimic those styles. So, the question is, down that road, using deep learning, can you ever get to breakthrough new things?
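For the curious, the “transferring artistic style” work mentioned above is generally associated with neural style transfer, where “style” is captured by correlations between feature channels (Gram matrices) and “content” by the feature maps themselves. The sketch below is a rough, assumption-laden illustration of those two losses on made-up feature maps — not a working style-transfer system, which would pull its features from a pretrained convnet and optimize the image by gradient descent.

```python
# Rough sketch of the idea behind neural style transfer losses.
import numpy as np

def gram(features):
    """features: (channels, height, width) -> (channels, channels) correlations."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(generated, style):
    # Match the correlation structure of the style image's features.
    return np.mean((gram(generated) - gram(style)) ** 2)

def content_loss(generated, content):
    # Match the content image's feature maps directly.
    return np.mean((generated - content) ** 2)

rng = np.random.default_rng(0)
content_feats = rng.normal(size=(16, 32, 32))    # stand-ins for convnet activations
style_feats = rng.normal(size=(16, 32, 32))
generated_feats = rng.normal(size=(16, 32, 32))

total = content_loss(generated_feats, content_feats) + 100.0 * style_loss(generated_feats, style_feats)
print(f"toy combined loss: {total:.3f}")
```

Minimizing such a loss reproduces an existing style — which is exactly why, as the conversation notes, it is better described as mimicry than as the kind of creativity that invents a new style.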
Sonal: Right. Generative intelligence. Not general, but generative.
Dr. Li: Generative. So, there’s a lot of thinking on that. We’re pretty far from going from Impressionism to Cubism and all this. But coming back to a more mundane class of work — for example, we are doing computer vision, and some of our recent work is to write a brief caption, or a few caption sentences, about images. And then the next thing we did is to start doing Q&A about a picture. And at this point, we start to think, “Can we actually develop algorithms that are not just learning the training data but learning to learn?”
Sonal: Exactly.
Dr. Li: Learning to ask the right question. For example, we just submitted a paper where — if we show the computer a picture and ask a question like, “What is the woman doing?” — instead of directly having the computer learn to answer, the computer needs to actually ask a series of questions in order to answer this. So the algorithm is not learning to answer the question directly, but learning to explore the potential space and ask the right questions to arrive at the final answer. That ability of learning to learn is what we want children to have. And this is what we’re exploring in our algorithms.
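To give a flavor of “asking a series of questions to arrive at the answer” — this is a toy stand-in, not the Stanford paper’s actual method — an agent can greedily pick whichever sub-question shrinks its uncertainty fastest, twenty-questions style, before committing to a final answer.

```python
# Toy "learning to ask" sketch: pick the attribute question with the most even split.
import math

candidates = [
    {"name": "reading",    "indoors": True,  "uses_hands": True,  "moving": False},
    {"name": "cooking",    "indoors": True,  "uses_hands": True,  "moving": True},
    {"name": "jogging",    "indoors": False, "uses_hands": False, "moving": True},
    {"name": "sunbathing", "indoors": False, "uses_hands": False, "moving": False},
]
truth = candidates[1]  # the activity actually shown in the hypothetical picture

def split_entropy(pool, attr):
    yes = sum(c[attr] for c in pool)
    probs = [yes / len(pool), 1 - yes / len(pool)]
    return -sum(p * math.log2(p) for p in probs if p > 0)

pool = list(candidates)
while len(pool) > 1:
    # Ask the sub-question whose answer is expected to eliminate the most candidates.
    attr = max(["indoors", "uses_hands", "moving"], key=lambda a: split_entropy(pool, a))
    answer = truth[attr]
    print(f"ask: {attr}? -> {answer}")
    pool = [c for c in pool if c[attr] == answer]

print("answer:", pool[0]["name"])
```

The real research question is how to learn such a questioning policy from data rather than hard-coding the attributes, but the loop captures the spirit of exploring a space of questions instead of answering in one shot.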
Sonal: Okay, so then let’s go back for a moment to something you said earlier, Fei-Fei. You know, I really like how you describe that these phases — the, sort of — the in vitro, like, the laboratory phase, and then the in vivo, like, the in-real-life phase. It’s a wonderful way of clumping the work and the moment we’re at, but there’s always been industry and lab and company, you know, collaboration since the beginning of computing. So, what is different now that startups can play in this space, in vivo?
Dr. Li: I think several factors. One is that the algorithms are maturing to the point that industry and startups can use them. You know, 20 years ago, it was only a few top places in the world — top labs in the world — that held the algorithms that could do some AI tasks. They hadn’t percolated to the rest of the industry or the rest of the world. So, for any startup, or even company, for that matter, to get their hands on those algorithms was difficult. But there are also other reasons. Because of the blossoming of [the] internet, because of the blossoming of sensing, we now have more use cases. In order to harness data, we need to manage and understand this information. This created a huge need for intelligent algorithms to do that. So, that’s a use case. Because of sensing, we start to get into scenarios like self-driving cars. And now suddenly, we need to create intelligent algorithms to have the cars drive. So, that’s what’s creating this, in my opinion, blossoming.
Frank: The fun thing to watch unfold will be startups versus big established labs and companies. And on the one hand, we’ve got George at comma.ai who built a self-driving car by himself, like, one person. And then on the other side, you’re involved with the SAIL-Toyota Center for AI Research, which is sort of the big industrial approach to this. So, what do you think the relative contributions will be between startups and big organizations?
Dr. Li: In terms of self-driving cars — who is gonna win the self-driving car competition, right? I think the advantages of the big companies are some of the following. A company like Toyota, as soon as they are committed to this — I hope they put cameras in their cars — they can already get data very quickly, whereas for a startup, this is a lot more difficult.
Sonal: The data, again, is the big differential.
Dr. Li: Companies like Google, even though they didn’t have cars at the beginning, they had algorithms. They started this early. So, they now have both data and algorithms.
Sonal: They were a software company first, as opposed to a car company trying to become a software company.
Dr. Li: Exactly. The software is such an important part. They actually have an edge there. What about startups? Do they still have an edge? I think there are a lot of business scenarios that might not be so critical on the path for these big companies. But a startup can come in through a more niche area, or a more vertical space, and build up its data and algorithms that way. Or, the startup can do what Mobileye does. Instead of building the entire system — [the] entire car — they build one critical component better than anybody else. And that’s another angle they can come in from.
Frank: Your colleague, Andrew Ng, who used to be at Stanford and now runs the AI lab at Baidu, has called Tesla’s Autopilot system irresponsible, because it got into a crash — because there are well-known scenarios, basically, where the system won’t perform safely. And so, Andrew said, “Look, it’s premature.” So, I wanted to get your thoughts on this, especially since you’re involved with the Toyota program.
Dr. Li: So, when Tesla’s Autopilot came out, I watched some of the YouTube videos. As a mom, I would never want to put my kids or myself into those cars. So, from that point of view, I did, kind of, react — you know, squeamishly on that. But what I’m hoping for here is a really clear communication strategy between the business and the consumers. I don’t have a Tesla, so I don’t know what Tesla told the users. But if the communication is extremely clear about when you should trust the system, when you should use it, and when you shouldn’t — then we get into the situation, you know, when customers are not doing the right thing — who is to blame? And we’re getting more and more into that in AI and ethics — who is to blame? Because every single machine, if used in the wrong way, could have very scary consequences. I think that’s a societal conversation we need to be having.
Sonal: Yet another example of how technologists and technology need marketing. I mean, we tell our company CEOs all the time about the importance of these functions. It just continually reinforces that.
Frank: Yeah, marketing and training and the right user experience.
Sonal: Right design.
Frank: So, this is going to be one of the hardest areas to design for, which is, if we’re on this continuum somewhere between intelligence augmentation and full autonomy, how do you design a system so that the driver knows, “Oh, it’s time for you to pay attention again, because I don’t know what to do.” Does the steering wheel vibrate? Is there an auditory cue? Like, these are gonna be tricky systems to design.
Sonal: I agree. And I think this is actually where there is a really important conversation to be had. Nissan has an anthropologist on staff, Dr. Melissa Cefkin. I forgot how to pronounce her last name, but she’s an anthropologist whose full-time job is to study these issues in order to build it into the actual design. And it’s not just, like, software engineers who are designing this. It’s a conversation to be had.
Dr. Li: In our Stanford-Toyota Center, this center has a group of professors working on different projects. And there is one big project that is led by [the] Human-Computer Interaction Group.
Sonal: It’s HCI, right?
Dr. Li: Yeah, it’s HCI because of this.
Frank: Yeah, it’s great to see, sort of, anthropologists, maybe philosophers come back into the mix, because these complex systems — you’d really want the full 360 degree view of design. It’s not just what technology enables, but what are human expectations around it.
Dr. Li: And one thing to really keep in mind. Compared to computers, humans are extremely slow computing machines. The information transfer in our brain is very slow compared to transistors, and on top of that, our motor system — you know, from our brain to our muscles — is even slower. So, when we are talking about human-machine interaction and split-second decision making, we should really factor that in.
Frank: Yeah, it sort of brings to mind the famous trolley problem. You knew I was gonna go there, Sonal, right? Because I can’t help bringing this up.
Sonal: And I edited Patrick Lin, who is, like, a long time thinker in this space. And he…
Frank: Yeah. And the YouTube video that Patrick created is great. So, if you want to sort of see the full exposition, go see his YouTube video. But in summary, the challenge is this — humans are slow. And so, if you get into an accident because your response time was too slow, you’re definitely not liable, right? Like, you just couldn’t react in time to the car braking in front of you. An autonomous car can actually make a decision. So, imagine that you’re an autonomous car, and then your algorithm needs to decide, “All right, the truck in front of me suddenly braked. I could plow myself into the back of the truck and injure my passengers, or I could swerve to the right and maybe take out the motorcyclist, or I could swerve to the left and hit a minivan.” The computer will need to make an explicit decision. And it has the reaction time to actually make an explicit decision. And so, if that decision is explicit, can it be held liable? Can the designer of that algorithm be held liable, because it made an explicit decision rather than having a split-second response?
Sonal: When people bring up the trolley example, it gets really frustrating, because it’s so abstract. But I actually think that the act of going through this thought process is exactly what gets you to answering these questions that you’re asking about liability — who’s accountable, the emotional tradeoffs that we make, and how to understand even our own limitations, as you point out, Fei-Fei.
Dr. Li: This actually brings up a topic that, in the past few years, I’ve been really advocating for in the education and research of AI. We need to inject a strong element of humanistic thinking into this, because our technology is more and more in vivo. It’s touching people’s real lives. And how do we think about, develop, and design algorithms that can, you know — hopefully better humans’ lives, but really have to cohabitate with humans? We need that kind of humanistic thinking.
Sonal: I actually want to ask about a paper that you guys recently just put out. I actually included it in our last newsletter. It was about autonomous cars navigating social spaces. So interesting, because this is lab research in the wild. This is no longer — you know, we can have these algorithms work perfectly fine. But to have them navigate — I’m thinking of streets like in India where, you know, there will be a cow and like 10 buffaloes behind you in the middle of all this, and I don’t know any computer that’s accounting for that. So, I’d love to hear how you guys came to that paper and some of the thinking.
Dr. Li: This is a project [where] the main PI is Silvio Savarese. It’s a social robot they created called Jackrabbot, named to honor California’s jackrabbit. And the purpose of Jackrabbot is an autonomous driving robot or vehicle that takes care of what we call the last miles of driving, where it tends to be in much more social spaces rather than highways. You know, sidewalks, busy cities, campuses, airports, and all this. So, when we look at the problem of the last miles of driving, or just the social space, we quickly realize the problem is — you know, not only do you have to do everything that a highway-driving car needs to be doing to understand the layout of the scene — the pedestrians, the lanes, and all this — you also have to navigate in a way that is courteous and acceptable to people.
So, one naive solution people suggest is, “Well, you know, just go at a really low speed and stop whenever there are people.” We tested that. If we do that, the robot will never go anywhere, because in a very crowded space, there are always people. If the robot just follows the most naive rule of, “I’ll yield to people all the time,” the robot would just be sitting there at the starting point and not getting anywhere.
Sonal: Frankly, if that robot were used in San Francisco, it would probably be kicked, too, a couple of times. Maybe people would be really irritated about it — or New York; they’d be irritated in Times Square about it moving so slowly.
Dr. Li: Yeah, right. So, we thought about that, and we haven’t figured out, you know, what to do yet — we think the robot has to have an SOS kind of call. So, what we want to do is to create a robot that understands human social dynamics, so it can carry on its task — for example, going from A to B to deliver something on campus — but do it in a courteous way. So, we started by first recording data on human behavior on campus, and looking at how people gather together when they talk in small groups, or how they walk — especially, you know, at 9:00 [on the] Stanford campus, there are so many students going into so many classes. But they’re not moving in a completely random way. They tend to form interesting patterns, depending on the direction they’re going.
So, we gather all this data and we feed it into the algorithm. We have the algorithm learn about this — especially by injecting some social rules, such as: people tend to follow others going in the same direction; you do not break up two people, or several people, when they’re talking. So, we injected all of these and learned the right way of doing it, and then we put it into the algorithm. And then the algorithm started to learn by itself how to navigate.
Sonal: Just to probe on that — how to navigate, not how to learn those social cues itself.
Dr. Li: Right. How to navigate. We give them some social cues, but we only give them high-level cues. The details, for example — the algorithm still has to learn, “When I avoid two people talking, how far do I stay away? Do I avoid them by 10 feet or two feet?” These are the things that are learned just by observing.
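One way to picture what that learned behavior amounts to — this is a toy sketch under stated assumptions, not the Jackrabbot system itself — is a cost function over candidate steps that rewards progress toward the goal and penalizes entering people’s personal space or cutting between a conversing pair, where constants like the comfort radius are exactly the kind of detail that would be fit from observed human trajectories rather than hand-tuned.

```python
# Toy "courteous navigation" sketch: greedy steps scored by a social cost function.
import numpy as np

PERSONAL_SPACE = 1.2   # meters; in practice, learned from data rather than hard-coded
robot = np.array([0.0, 0.0])
goal = np.array([10.0, 0.0])
people = [np.array([4.0, 0.5]), np.array([4.0, -0.5])]   # two people standing and talking
talking_pairs = [(0, 1)]

def step_cost(pos):
    cost = np.linalg.norm(goal - pos)                     # make progress toward the goal
    for p in people:
        d = np.linalg.norm(pos - p)
        if d < PERSONAL_SPACE:                            # don't crowd anyone
            cost += 10.0 * (PERSONAL_SPACE - d)
    for i, j in talking_pairs:                            # don't walk through a conversation
        midpoint = (people[i] + people[j]) / 2
        if np.linalg.norm(pos - midpoint) < PERSONAL_SPACE:
            cost += 20.0
    return cost

# Greedy one-step lookahead over a handful of candidate moves.
for _ in range(30):
    candidates = [robot + 0.5 * np.array([np.cos(a), np.sin(a)])
                  for a in np.linspace(0, 2 * np.pi, 16, endpoint=False)]
    robot = min(candidates, key=step_cost)
    if np.linalg.norm(goal - robot) < 0.5:
        break

print("final position:", np.round(robot, 2))
```

Run as written, the robot arcs around the talking pair instead of stopping in front of them or barging through — the behavior Li describes — and the point of learning from data is to get those penalty shapes and distances right instead of guessing them.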
Sonal: Have there been any new surprises yet for you guys, out of this?
Dr. Li: No. Sorry.
Frank: When I read the paper, the question that immediately came to mind for me — which is that social norms vary from place to place.
Sonal: That’s what I was thinking too, the cross cultural aspect, especially.
Frank: And so when we ship these robots that observe social norms, is this going to be the new localization? In other words, here’s the self-navigating robot, Mumbai edition. Here’s a self-navigating robot, Boston edition.
Dr. Li: Excellent question. So, my answer to that is, as of now, we have to train them location by location. We have to gather data. But, as I was saying earlier, you know, the next dream I would have is to teach robots how to learn — learning to learn — rather than just to mimic training data. At that point, it should be online learning. It should be incremental learning so that the robot can adapt to different…
Frank: Right. So you wouldn’t have to train it on a particular city’s actual traffic patterns. You just drop it in there and the robot will figure it out.
Dr. Li: Exactly.
Sonal: Like the way humans do when you travel — when in Rome, do as the Romans do, so to speak. I mean, I come from the world of developmental psychology, and the development of moral and social mores requires not just regular cognition, but metacognition and an awareness of your own thinking — that is a whole new layer that just complicates things. So it’s super fascinating. Okay, so I want to go back, then, to something you said, Fei-Fei, about this humanistic side of things. Tell us more about what you’re thinking when you say that. Like, do you mean that we should be injecting humanities into computer science, or art — like, you know, I’ve heard of this move from STEM to STEAM. Like, what are you actually talking about when you say that?
Dr. Li: So, here’s where it all came from. About three years ago, I was thinking — I was observing that in my professional life, there are two crises people tend to talk about, and they seem to be completely disconnected, these two crises. The first crisis is that terminators are coming next door, and AIs are turning evil, and all this. We’re summoning evil, and AI is gonna just one day rule us all. That’s one crisis. Another crisis we also hear about is the lack of diversity in STEM and computing — and, from where I stand, the total lack of diversity in AI. And it dawned on me that these two crises are actually connected by a very important hypothesis, which is the lack of humanistic thinking and a humanistic mission statement in the education and development of our technology.
So, let’s look at the first one. Why do we ever think technology might turn evil? Well, technologies are always in the hands of people. Technologies themselves are neutral. You know, be it nuclear weapons or nuclear physics, or just a knife, you know, that can cut [an] apple — you know, in the hands of people, technology can have consequences. So, in order to have responsible and benevolent technology, what we really want is to have a society, have a group of technologists, who have the humanistic awareness and thinking — so that we can use technology responsibly. So, that’s related to the first thing. The second thing is — millions and millions and millions of dollars are put into attracting diversity into computing and STEM, and yet from where I stand, I find it very hard to convince women and underrepresented minorities to work in AI.
Sonal: This is, by the way — despite being at Stanford, which has, what, 50/50 parity in the computer science program with women and men?
Dr. Li: Oh, no, it’s not 50/50. It’s about 25% to 30% women in the undergraduate program. And then this thing just goes down as you…
Sonal: Oh, goes down as you go higher. Okay.
Dr. Li: Oh, yeah. The attrition at every stage is grim. And so, looking at Stanford students, they’re extremely talented. Almost any student coming to Stanford, whether it’s an undergrad or a Ph.D. student, is talented enough to be analytical, but also has, you know, great writing skills and cares about the world. I suddenly realized that here, in our field, as well as in Silicon Valley, we’re not sending the right messages to attract people from all walks of life.
Sonal: What do you mean by that?
Dr. Li: We tend to just celebrate geekiness, nerdiness. But when you have an ambitious young woman coming into our department, or into the AI lab, she might be thinking about the aging society. She might be thinking about curing cancer. She might be thinking about a lot of socially important topics. If we present ourselves just as geeks loving to do geeky things, we’re missing a huge demographic who actually want to turn technology into [a] humanistic mission. So then, suddenly, I realized, we’re missing [a] huge opportunity to attract diversity, because we’re not talking enough or thinking enough about [the] humanistic mission in AI. And that united the two themes I’ve been thinking about.
Sonal: Just to put a sharper point on this. I don’t want to be cliché about “only women and underrepresented minorities would take on ‘the soft problems,’” because there are also other people who might want to take on those challenges of aging, and some of the other interesting shifts that are happening. But to your point, we’re not necessarily inclusive enough — we’re not thinking about this enough, period, regardless of background — to be able to really welcome that type of thinking.
Dr. Li: I think it’s all walks of life. They come with their experiences and value systems.
Sonal: That’s fair.
Dr. Li: One thing I’ve started to notice: I have a lot of friends who are extremely successful Silicon Valley entrepreneurs and technologists. And, given my own age, many of them are entering the age where they have aging parents.
Sonal: Yes, this is so top of mind.
Dr. Li: Suddenly they’re talking about health care.
Sonal: Which they never did before.
Dr. Li: When they were [in their] 20s, they were thinking about beer. You know, they weren’t talking about health care. Yeah.
Sonal: Your point is that having that access to that experience is really important to that perspective.
Dr. Li: Right. So all walks of life add to our collective thinking and creativity…
Sonal: Right. It’s a great point.
Dr. Li: …in our technology.
Frank: I know one of the things that your lab does is an outreach program for high school girls, who come to campus for two weeks.
Dr. Li: This is the brainchild of my former student, Dr. Olga Russakovsky, and me. Our hypothesis is, let’s catch girls at the age when they’re starting to think about who they are and what they want to do. And we find the age group of high school freshmen to sophomores thinking about what they want to focus on. So, we created this AI camp that aims for two things. One is, we want to be very technical, because we want to inspire the future leaders of AI and talented math and computing students. But we also want to attract students who otherwise might not think of AI, because they didn’t know such a strong humanistic mission is in AI. We actually [ran] very rigorous hypothesis testing over the summer and wrote a technical paper about this.
Sonal: I like this approach, by the way, because I get really tired of hearing all the different “camp for this, camp for that, program for this, program for that,” and I feel like, “Come on, guys, are we really solving the problem?” It’s kind of refreshing to hear that you’re taking a much more rigorous approach to it.
Dr. Li: Right. So here’s how our camp is designed — in the morning, the students go through rigorous lectures and work with the TAs, Ph.D. students, and postdocs on the technical problems of AI. In the afternoon, the girls were divided into four research groups. And each of the research projects is a technical AI project — for example, computer vision or NLP or computational biology. But we put a very strong humanistic statement into each of the projects. For example, last year, we had four projects. The computer vision project uses depth sensors to look at hospital environments and help doctors and nurses monitor hand hygiene scenarios. The NLP (natural language) project uses Twitter data during natural disasters — for example, earthquakes — and the girls’ aim is to do the right data mining to find messages that help with disaster relief. And for the self-driving car project, we designed a problem around aging — a senior who needs to retrieve drops…
Sonal: That’s amazing.
Dr. Li: …and go there and come back. So, everything is very technical, but suddenly they learn to connect these technologies to humanistic purposes. We have a team of researchers — two undergrads, one Ph.D. student, and myself — and we conducted a rigorous evaluation project on this hypothesis: can humanism increase interest in AI? And we found a statistically significant difference — before and after — in these girls’ thinking. And that particular paper was published at a computer science education conference to show this makes a difference.
Sonal: That’s great. It’ll be interesting to see what happens when you expand that to other groups.
Dr. Li: Yeah, we’re running it again this year. And we really hope that this can become a continuous program.
Sonal: Okay. Well, Fei-Fei, I’m excited to have you join us and bring all these perspectives to our own firm and the entrepreneurs we work with. And we’re so excited. Thank you for joining.
Dr. Li: Thank you.
The a16z Podcast discusses the most important ideas within technology with the people building it. Each episode aims to put listeners ahead of the curve, covering topics like AI, energy, genomics, space, and more.