Howdy, Sarah here!
This is a ~10-minute summary of the 2.5-hour tutorial “Application Development Using Large Language Models” given by Andrew Ng and OpenAI’s Isa Fulford at NeurIPS (Dec 11, 2023).
I recently attended with Metaphor and managed to get a spot at the talk, haha. This document is based on the notes I took, polished with a good amount of restructuring, additions, and diagrams. It also includes the Prompt Engineering notes I shared a few days ago.
A special thanks to contributors Swyx, Jerry Liu, Kudzo Ahegbebu, and Brian Huang for their input, feedback, and comments.
look how crowded! every inch of floor space and walkway was filled 😱
While the talk is not available online, these notes hopefully provide a comprehensive but quick summary :) Enjoy!
Table of Contents:
LLMs are a reasoning engine.
They have a lot of general knowledge, but don’t know everything (ex: proprietary information). We can leverage them as a reasoning engine to process retrieved information, rather than using them as a source of memorized information.
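To make this concrete, here is a minimal sketch of that pattern: retrieve relevant text, then hand it to the LLM to reason over. The document names (`DOCS`), the keyword-overlap retriever, and the prompt wording are all my own illustrative assumptions, not from the talk.

```python
# Toy "knowledge base" of proprietary facts the model was never trained on.
# (Hypothetical data, for illustration only.)
DOCS = [
    "Acme's Q3 refund policy: refunds allowed within 30 days of purchase.",
    "Acme's support hours are 9am-5pm PT, Monday through Friday.",
]

def retrieve(question: str, docs: list[str]) -> list[str]:
    """Naive keyword retrieval: keep docs that share a word with the question."""
    q_words = set(question.lower().split())
    return [d for d in docs if q_words & set(d.lower().split())]

def build_prompt(question: str, docs: list[str]) -> str:
    """Stuff retrieved context into the prompt so the LLM reasons over it,
    rather than relying on memorized knowledge."""
    context = "\n".join(retrieve(question, docs))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("What are Acme's support hours?", DOCS)
print(prompt)
```

A real system would swap the keyword matcher for embedding-based search, but the shape is the same: the retrieval supplies the facts, and the LLM supplies the reasoning.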
The main task of an LLM is to generate tokens. Tokens are words or word fragments, and a token is generally ~3/4 of a word. Given a phrase or sentence, the LLM generates the next token.
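The generate-one-token-at-a-time loop can be sketched with a toy stand-in. Here a hand-written bigram table plays the role of the model's next-token prediction, and "tokens" are whole words for simplicity (real tokenizers use subword pieces, hence the ~3/4-of-a-word average); everything in this block is an illustrative assumption, not the talk's code.

```python
# Toy next-token table: maps a token to the token that follows it.
# A real LLM computes this prediction from billions of parameters.
BIGRAMS = {
    "my": "favorite",
    "favorite": "food",
    "food": "is",
    "is": "pizza",
}

def generate(prompt: str, max_tokens: int = 4) -> str:
    """Autoregressive loop: repeatedly predict and append the next token."""
    tokens = prompt.lower().split()
    for _ in range(max_tokens):
        nxt = BIGRAMS.get(tokens[-1])
        if nxt is None:  # no known continuation: stop generating
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("My favorite"))  # → my favorite food is pizza
```

The key point the toy preserves: each step conditions only on the sequence so far, so generation is just this predict-append loop repeated.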
Some example applications include: