Infra

AI Canon

Derrick Harris, Matt Bornstein, and Guido Appenzeller Posted May 25, 2023
AI Canon

Research in artificial intelligence is increasing at an exponential rate. It’s difficult for AI experts to keep up with everything new being published, and even harder for beginners to know where to start.

So, in this post, we’re sharing a curated list of resources we’ve relied on to get smarter about modern AI. We call it the “AI Canon” because these papers, blog posts, courses, and guides have had an outsized impact on the field over the past several years.

We start with a gentle introduction to transformer and latent diffusion models, which are fueling the current AI wave. Next, we go deep on technical learning resources; practical guides to building with large language models (LLMs); and analysis of the AI market. Finally, we include a reference list of landmark research results, starting with “Attention is All You Need”—the 2017 paper by Google that introduced the world to transformer models and ushered in the age of generative AI.

A gentle introduction…

These articles require no specialized background and can help you get up to speed quickly on the most important parts of the modern AI wave.

  • Software 2.0: Andrej Karpathy was one of the first to clearly explain (in 2017!) why the new AI wave really matters. His argument is that AI is a new and powerful way to program computers. As LLMs have improved rapidly, this thesis has proven prescient, and it gives a good mental model for how the AI market may progress.
  • State of GPT: Also from Karpathy, this is a very approachable explanation of how ChatGPT / GPT models in general work, how to use them, and what directions R&D may take.
  • What is ChatGPT doing … and why does it work?: Computer scientist and entrepreneur Stephen Wolfram gives a long but highly readable explanation, from first principles, of how modern AI models work. He follows the timeline from early neural nets to today’s LLMs and ChatGPT.
  • Transformers, explained: This post by Dale Markowitz is a shorter, more direct answer to the question “what is an LLM, and how does it work?” This is a great way to ease into the topic and develop intuition for the technology. It was written about GPT-3 but still applies to newer models.
  • How Stable Diffusion works: This is the computer vision analogue to the last post. Chris McCormick gives a layperson’s explanation of how Stable Diffusion works and develops intuition around text-to-image models generally. For an even gentler introduction, check out this comic from r/StableDiffusion.

Foundational learning: neural networks, backpropagation, and embeddings

These resources provide a base understanding of fundamental ideas in machine learning and AI, from the basics of deep learning to university-level courses from AI experts.

Explainers

Courses

  • Stanford CS229: Introduction to Machine Learning with Andrew Ng, covering the fundamentals of machine learning.
  • Stanford CS224N: NLP with Deep Learning with Chris Manning, covering NLP basics through the first generation of LLMs.

Tech deep dive: understanding transformers and large models

There are countless resources—some better than others—attempting to explain how LLMs work. Here are some of our favorites, targeting a wide range of readers/viewers.

Explainers

Courses

  • Stanford CS25: Transformers United, an online seminar on Transformers.
  • Stanford CS324: Large Language Models with Percy Liang, Tatsu Hashimoto, and Chris Re, covering a wide range of technical and non-technical aspects of LLMs.

Reference and commentary

  • Predictive learning, NIPS 2016: In this early talk, Yann LeCun makes a strong case for unsupervised learning as a critical element of AI model architectures at scale. Skip to 19:20 for the famous cake analogy, which is still one of the best mental models for modern AI.
  • AI for full-self driving at Tesla: Another classic Karpathy talk, this time covering the Tesla data collection engine. Starting at 8:35 is one of the great all-time AI rants, explaining why long-tailed problems (in this case stop sign detection) are so hard.
  • The scaling hypothesis: One of the most surprising aspects of LLMs is that scaling—adding more data and compute—just keeps increasing accuracy. GPT-3 was the first model to demonstrate this clearly, and Gwern’s post does a great job explaining the intuition behind it.
  • Chinchilla’s wild implications: Nominally an explainer of the important Chinchilla paper (see below), this post gets to the heart of the big question in LLM scaling: are we running out of data? This builds on the post above and gives a refreshed view on scaling laws.
  • A survey of large language models: Comprehensive breakdown of current LLMs, including development timeline, size, training strategies, training data, hardware, and more.
  • Sparks of artificial general intelligence: Early experiments with GPT-4: Early analysis from Microsoft Research on the capabilities of GPT-4, the current most advanced LLM, relative to human intelligence.
  • The AI revolution: How Auto-GPT unleashes a new era of automation and creativity: An introduction to Auto-GPT and AI agents in general. This technology is very early but important to understand—it uses internet access and self-generated sub-tasks in order to solve specific, complex problems or goals.
  • The Waluigi Effect: Nominally an explanation of the “Waluigi effect” (i.e., why “alter egos” emerge in LLM behavior), but interesting mostly for its deep dive on the theory of LLM prompting.

Practical guides to building with LLMs

A new application stack is emerging with LLMs at the core. While there isn’t a lot of formal education available on this topic yet, we pulled out some of the most useful resources we’ve found.

Reference

  • Build a GitHub support bot with GPT3, LangChain, and Python: One of the earliest public explanations of the modern LLM app stack. Some of the advice in here is dated, but in many ways it kicked off widespread adoption and experimentation of new AI apps.
  • Building LLM applications for production: Chip Huyen discusses many of the key challenges in building LLM apps, how to address them, and what types of use cases make the most sense.
  • Prompt Engineering Guide: For anyone writing LLM prompts—including app devs—this is the most comprehensive guide, with specific examples for a handful of popular models. For a lighter, more conversational treatment, try Brex’s prompt engineering guide.
  • Prompt injection: What’s the worst that can happen? Prompt injection is a potentially serious security vulnerability lurking for LLM apps, with no perfect solution yet. Simon Willison gives the definitive description of the problem in this post. Nearly everything Simon writes on AI is outstanding.
  • OpenAI cookbook: For developers, this is the definitive collection of guides and code examples for working with the OpenAI API. It’s updated continually with new code examples.
  • Pinecone learning center: Many LLM apps are based around a vector search paradigm. Pinecone’s learning center—despite being branded vendor content—offers some of the most useful instruction on how to build in this pattern.
  • LangChain docs: As the default orchestration layer for LLM apps, LangChain connects to just about all other pieces of the stack. So their docs are a real reference for the full stack and how the pieces fit together.

Courses

  • LLM Bootcamp: A practical course for building LLM-based applications with Charles Frye, Sergey Karayev, and Josh Tobin.
  • Hugging Face Transformers: Guide to using open-source LLMs in the Hugging Face transformers library.

LLM benchmarks

  • Chatbot Arena: An Elo-style ranking system of popular LLMs, led by a team at UC Berkeley. Users can also participate by comparing models head to head.
  • Open LLM Leaderboard: A ranking by Hugging Face, comparing open source LLMs across a collection of standard benchmarks and tasks.

Market analysis

We’ve all marveled at what generative AI can produce, but there are still a lot of questions about what it all means. Which products and companies will survive and thrive? What happens to artists? How should companies use it? How will it affect literally jobs and society at large? Here are some attempts at answering these questions.

a16z thinking

  • Who owns the generative AI platform?: Our flagship assessment of where value is accruing, and might accrue, at the infrastructure, model, and application layers of generative AI.
  • Navigating the high cost of AI compute: A detailed breakdown of why generative AI models require so many computing resources, and how to think about acquiring those resources (i.e., the right GPUs in the right quantity, at the right cost) in a high-demand market.
  • Art isn’t dead, it’s just machine-generated: A look at how AI models were able to reshape creative fields—often assumed to be the last holdout against automation—much faster than fields such as software development.
  • The generative AI revolution in games: An in-depth analysis from our Games team at how the ability to easily create highly detailed graphics will change how game designers, studios, and the entire market function. This follow-up piece from our Games team looks specifically at the advent of AI-generated content vis à vis user-generated content.
  • For B2B generative AI apps, is less more?: A prediction for how LLMs will evolve in the world of B2B enterprise applications, centered around the idea that summarizing information will ultimately be more valuable than producing text.
  • Financial services will embrace generative AI faster than you think: An argument that the financial services industry is poised to use generative AI for personalized consumer experiences, cost-efficient operations, better compliance, improved risk management, and dynamic forecasting and reporting. 
  • Generative AI: The next consumer platform: A look at opportunities for generative AI to impact the consumer market across a range of sectors from therapy to ecommerce.
  • To make a real difference in health care, AI will need to learn like we do: AI is poised to irrevocably change how we look to prevent and treat illness. However, to truly transform drug discovery to care delivery, we should invest in creating an ecosystem of “specialist” AIs—that learn like our best physicians and drug developers do today.
  • The new industrial revolution: Bio x AI: The next industrial revolution in human history will be biology powered by artificial intelligence.

Other perspectives


Landmark research results

Most of the amazing AI products we see today are the result of no-less-amazing research, carried out by experts inside large companies and leading universities. Lately, we’ve also seen impressive work from individuals and the open source community taking popular projects into new directions, for example by creating automated agents or porting models onto smaller hardware footprints. 

Here’s a collection of many of these papers and projects, for folks who really want to dive deep into generative AI. (For research papers and projects, we’ve also included links to the accompanying blog posts or websites, where available, which tend to explain things at a higher level. And we’ve included original publication years so you can track foundational research over time.)

Large language models

New models

Model improvements (e.g. fine-tuning, retrieval, attention)

Image generation models

Agents

Other data modalities

Code generation

Video generation

Human biology and medical data

Audio generation

Multi-dimensional image generation

Special thanks to Jack Soslow, Jay Rughani, Marco Mascorro, Martin Casado, Rajko Radovanovic, and Vijay Pande for their contributions to this piece, and to the entire a16z team for an always informative discussion about the latest in AI. And thanks to Sonal Chokshi and the crypto team for building a long series of canons at the firm.

About the Contributors
Want More a16z Infra?

Analysis and news covering the latest trends reshaping AI and infrastructure.

Learn More
Recommended For You

Want More Infra?

Analysis and news covering the latest trends reshaping AI and infrastructure.

Sign Up On Substack

Views expressed in “posts” (including podcasts, videos, and social media) are those of the individual a16z personnel quoted therein and are not the views of a16z Capital Management, L.L.C. (“a16z”) or its respective affiliates. a16z Capital Management is an investment adviser registered with the Securities and Exchange Commission. Registration as an investment adviser does not imply any special skill or training. The posts are not directed to any investors or potential investors, and do not constitute an offer to sell — or a solicitation of an offer to buy — any securities, and may not be used or relied upon in evaluating the merits of any investment.

The contents in here — and available on any associated distribution platforms and any public a16z online social media accounts, platforms, and sites (collectively, “content distribution outlets”) — should not be construed as or relied upon in any manner as investment, legal, tax, or other advice. You should consult your own advisers as to legal, business, tax, and other related matters concerning any investment. Any projections, estimates, forecasts, targets, prospects and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Any charts provided here or on a16z content distribution outlets are for informational purposes only, and should not be relied upon when making any investment decision. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. In addition, posts may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein. All content speaks only as of the date indicated.

Under no circumstances should any posts or other information provided on this website — or on associated content distribution outlets — be construed as an offer soliciting the purchase or sale of any security or interest in any pooled investment vehicle sponsored, discussed, or mentioned by a16z personnel. Nor should it be construed as an offer to provide investment advisory services; an offer to invest in an a16z-managed pooled investment vehicle will be made separately and only by means of the confidential offering documents of the specific pooled investment vehicles — which should be read in their entirety, and only to those who, among other requirements, meet certain qualifications under federal securities laws. Such investors, defined as accredited investors and qualified purchasers, are generally deemed capable of evaluating the merits and risks of prospective investments and financial matters.

There can be no assurances that a16z’s investment objectives will be achieved or investment strategies will be successful. Any investment in a vehicle managed by a16z involves a high degree of risk including the risk that the entire amount invested is lost. Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by a16z is available here: https://a16z.com/investments/. Past results of a16z’s investments, pooled investment vehicles, or investment strategies are not necessarily indicative of future results. Excluded from this list are investments (and certain publicly traded cryptocurrencies/ digital assets) for which the issuer has not provided permission for a16z to disclose publicly. As for its investments in any cryptocurrency or token project, a16z is acting in its own financial interest, not necessarily in the interests of other token holders. a16z has no special role in any of these projects or power over their management. a16z does not undertake to continue to have any involvement in these projects other than as an investor and token holder, and other token holders should not expect that it will or rely on it to have any particular involvement.

With respect to funds managed by a16z that are registered in Japan, a16z will provide to any member of the Japanese public a copy of such documents as are required to be made publicly available pursuant to Article 63 of the Financial Instruments and Exchange Act of Japan. Please contact compliance@a16z.com to request such documents.

For other site terms of use, please go here. Additional important information about a16z, including our Form ADV Part 2A Brochure, is available at the SEC’s website: http://www.adviserinfo.sec.gov.