[YouTube Lecture Summary] Andrej Karpathy - Deep Dive into LLMs like ChatGPT

Introduction

Pre-Training

Step 1: Download and preprocess the internet

Step 2: Tokenization

Step 3: Neural network training

Step 4: Inference

Base model

Post-Training: Supervised Finetuning

Conversations

Hallucinations

Knowledge of Self

Models need tokens to think

Things the model cannot do well

Post-Training: Reinforcement Learning

Reinforcement learning

DeepSeek-R1

AlphaGo

Reinforcement learning from human feedback (RLHF)

Preview of things to come

Keeping track of LLMs

Where to find LLMs

Knowledge of Self

1. LLM has no self-awareness

  • LLM (Large Language Model) is a system without memory or ego .

  • When a conversation ends, all information is deleted, and the next conversation starts from a completely new state.

  • That is, it is not an entity that is aware of itself or exists continuously.


2. The way LLM introduces itself is a simple pattern learning result.

  • Answers to questions like “Who are you?” are not because the model recognizes itself, but because it probabilistically generates the most appropriate sentence from the data it has learned .

  • For example, there is a lot of information about OpenAI and ChatGPT on the Internet, so the model might answer something like "I am ChatGPT developed by OpenAI."

  • However, this is only the most frequently appearing pattern in the learned data and is not always accurate information .


3. How to establish the identity of LLM

There are two ways to make a model assume a specific identity.

1) Fine-Tuning Training Data

  • If you teach the model the desired answer to a specific question (e.g., "Who are you?"), the model will follow that answer.

  • Example: "I am an Almo model developed by Allen AI."


2) Insert System Message

  • By inserting hidden system messages at the beginning of a conversation , you can make the model reference specific information.

  • Example: "You are ChatGPT 4.0 developed by OpenAI, and your knowledge cutoff is 2024."

  • The user cannot see this message, but the model uses it to communicate.