Date: 2023-05-25

Research in artificial intelligence is increasing at an exponential rate. It’s difficult for AI experts to keep up with everything new being published, and even harder for beginners to know where to start.

So, in this post, we’re sharing a curated list of resources we’ve relied on to get smarter about modern AI. We call it the “AI Canon” because these papers, blog posts, courses, and guides have had an outsized impact on the field over the past several years.

We start with a gentle introduction to transformer and latent diffusion models, which are fueling the current AI wave. Next, we go deep on technical learning resources; practical guides to building with large language models (LLMs); and analysis of the AI market. Finally, we include a reference list of landmark research results, starting with “Attention is All You Need” — the 2017 paper by Google that introduced the world to transformer models and ushered in the age of generative AI.

These articles require no specialized background and can help you get up to speed quickly on the most important parts of the modern AI wave.

Software 2.0 : Andrej Karpathy was one of the first to clearly explain (in 2017!) why the new AI wave really matters. His argument is that AI is a new and powerful way to program computers. As LLMs have improved rapidly, this thesis has proven prescient, and it gives a good mental model for how the AI market may progress.

State of GPT : Also from Karpathy, this is a very approachable explanation of how ChatGPT / GPT models in general work, how to use them, and what directions R&D may take.

What is ChatGPT doing … and why does it work? : Computer scientist and entrepreneur Stephen Wolfram gives a long but highly readable explanation, from first principles, of how modern AI models work. He follows the timeline from early neural nets to today’s LLMs and ChatGPT.

Transformers, explained : This post by Dale Markowitz is a shorter, more direct answer to the question “what is an LLM, and how does it work?” This is a great way to ease into the topic and develop intuition for the technology. It was written about GPT-3 but still applies to newer models.

How Stable Diffusion works : This is the computer vision analogue to the last post. Chris McCormick gives a layperson’s explanation of how Stable Diffusion works and develops intuition around text-to-image models generally. For an even gentler introduction, check out this comic from r/StableDiffusion.

Foundational learning: neural networks, backpropagation, and embeddings

These resources provide a base understanding of fundamental ideas in machine learning and AI, from the basics of deep learning to university-level courses from AI experts.

Courses

Stanford CS229 : Introduction to Machine Learning with Andrew Ng, covering the fundamentals of machine learning.

Stanford CS224N : NLP with Deep Learning with Chris Manning, covering NLP basics through the first generation of LLMs.

Tech deep dive: understanding transformers and large models

There are countless resources — some better than others — attempting to explain how LLMs work. Here are some of our favorites, targeting a wide range of readers/viewers.

Courses

Stanford CS25 : Transformers United, an online seminar on Transformers.

Stanford CS324 : Large Language Models with Percy Liang, Tatsu Hashimoto, and Chris Re, covering a wide range of technical and non-technical aspects of LLMs.

Practical guides to building with LLMs

A new application stack is emerging with LLMs at the core. While there isn’t a lot of formal education available on this topic yet, we pulled out some of the most useful resources we’ve found.

LLM Bootcamp : A practical course for building LLM-based applications with Charles Frye, Sergey Karayev, and Josh Tobin.

Hugging Face Transformers : Guide to using open-source LLMs in the Hugging Face transformers library.

Chatbot Arena : An Elo-style ranking system of popular LLMs, led by a team at UC Berkeley. Users can also participate by comparing models head to head.

Open LLM Leaderboard : A ranking by Hugging Face, comparing open source LLMs across a collection of standard benchmarks and tasks.
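
For readers unfamiliar with the "Elo-style" ranking mentioned for Chatbot Arena above, here is a minimal sketch of a classic Elo update after one head-to-head comparison. It illustrates the general technique only and is not a description of how Chatbot Arena computes its leaderboard. The K-factor of 32, the 400-point scale, the starting rating of 1000, and the model names are all assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, and 0.5 for a tie.
    k (assumed to be 32 here) controls how far a single result moves ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


# Hypothetical models, both starting from an assumed rating of 1000.
ratings = {"model_x": 1000.0, "model_y": 1000.0}

# Each tuple is (first model, second model, score for the first model).
votes = [("model_x", "model_y", 1.0), ("model_x", "model_y", 0.5)]

for a, b, score_a in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score_a)
```

A win against an equally rated opponent moves each rating by half of K, while a win against a much weaker opponent moves them very little. This is why rankings of this kind reward beating strong models more than beating weak ones.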

Market analysis

We’ve all marveled at what generative AI can produce, but there are still a lot of questions about what it all means. Which products and companies will survive and thrive? What happens to artists? How should companies use it? How will it affect jobs and society at large? Here are some attempts at answering these questions.

Landmark research results

Most of the amazing AI products we see today are the result of no-less-amazing research, carried out by experts inside large companies and leading universities. Lately, we’ve also seen impressive work from individuals and the open source community taking popular projects into new directions, for example by creating automated agents or porting models onto smaller hardware footprints.

Here’s a collection of many of these papers and projects, for folks who really want to dive deep into generative AI. (For research papers and projects, we’ve also included links to the accompanying blog posts or websites, where available, which tend to explain things at a higher level. And we’ve included original publication years so you can track foundational research over time.)

Special thanks to Jack Soslow, Jay Rughani, Marco Mascorro, Martin Casado, Rajko Radovanovic, and Vijay Pande for their contributions to this piece, and to the entire a16z team for an always informative discussion about the latest in AI. And thanks to Sonal Chokshi and the crypto team for building a long series of canons at the firm.