Howdy, Sarah here!
This is a ~10-minute summary of the 2.5-hour tutorial “Application Development Using Large Language Models” given by Andrew Ng and OpenAI’s Isa Fulford at NeurIPS (Dec 11, 2023).
I recently attended with Metaphor and managed to get a spot at the talk, haha. This document is based on the notes I took, polished with a good amount of restructuring, additions, and diagrams. It also includes the Prompt Engineering notes I shared a few days ago.
A special thanks to contributors Swyx, Jerry Liu, Kudzo Ahegbebu, and Brian Huang for their input, feedback, and comments.
look how crowded! every inch of floor space and walkway was filled 😱
While the talk is not available online, these notes hopefully provide a comprehensive but quick summary :) Enjoy!
Table of Contents:
LLMs are a reasoning engine.
They have a lot of general knowledge, but don’t know everything (ex: proprietary information). We can leverage them as a reasoning engine to process retrieved information, rather than using them as a source of memorized information.
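To make this concrete, here is a minimal sketch of that pattern: retrieve relevant text, then hand it to the LLM to reason over. The document names (`DOCS`), the keyword-overlap retriever, and the prompt wording are all my own illustrative assumptions, not from the talk.

```python
# Toy "knowledge base" of proprietary facts the model was never trained on.
# (Hypothetical data, for illustration only.)
DOCS = [
    "Acme's Q3 refund policy: refunds allowed within 30 days of purchase.",
    "Acme's support hours are 9am-5pm PT, Monday through Friday.",
]

def retrieve(question: str, docs: list[str]) -> list[str]:
    """Naive keyword retrieval: keep docs that share a word with the question."""
    q_words = set(question.lower().split())
    return [d for d in docs if q_words & set(d.lower().split())]

def build_prompt(question: str, docs: list[str]) -> str:
    """Stuff retrieved context into the prompt so the LLM reasons over it,
    rather than relying on memorized knowledge."""
    context = "\n".join(retrieve(question, docs))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("What are Acme's support hours?", DOCS)
print(prompt)
```

A real system would swap the keyword matcher for embedding-based search, but the shape is the same: the retrieval supplies the facts, and the LLM supplies the reasoning.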
The main task of an LLM is to generate tokens. Tokens are words or word fragments, and a token is generally ~3/4 of a word. Given a phrase or sentence, the LLM generates the next token.
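The generate-one-token-at-a-time loop can be sketched with a toy stand-in. Here a hand-written bigram table plays the role of the model's next-token prediction, and "tokens" are whole words for simplicity (real tokenizers use subword pieces, hence the ~3/4-of-a-word average); everything in this block is an illustrative assumption, not the talk's code.

```python
# Toy next-token table: maps a token to the token that follows it.
# A real LLM computes this prediction from billions of parameters.
BIGRAMS = {
    "my": "favorite",
    "favorite": "food",
    "food": "is",
    "is": "pizza",
}

def generate(prompt: str, max_tokens: int = 4) -> str:
    """Autoregressive loop: repeatedly predict and append the next token."""
    tokens = prompt.lower().split()
    for _ in range(max_tokens):
        nxt = BIGRAMS.get(tokens[-1])
        if nxt is None:  # no known continuation: stop generating
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("My favorite"))  # → my favorite food is pizza
```

The key point the toy preserves: each step conditions only on the sequence so far, so generation is just this predict-append loop repeated.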
Some example applications include: