[YouTube Lecture Summary] Andrej Karpathy - Deep Dive into LLMs like ChatGPT

Introduction

Pre-Training

Step 1: Download and preprocess the internet

Step 2: Tokenization

Step 3: Neural network training

Step 4: Inference

Base model

Post-Training: Supervised Finetuning

Conversations

Hallucinations

Knowledge of Self

Models need tokens to think

Things the model cannot do well

Post-Training: Reinforcement Learning

Reinforcement learning

DeepSeek-R1

AlphaGo

Reinforcement learning from human feedback (RLHF)

Preview of things to come

Keeping track of LLMs

Where to find LLMs

Things the model cannot do well

1. Counting Problem 🔢

  • Problem: The model often fails to count items accurately.

  • Reason: Information is processed in token units rather than as individual characters.

  • Example: If you paste a long string of dots (.....) and ask how many there are, the model often predicts the wrong number.

  • Solution: Use the "run Python code" feature (code interpreter) to calculate the exact number, as in the sketch below.
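
A minimal sketch of this workaround; the dot string is only a hypothetical stand-in for whatever the user pasted:

```python
# Count programmatically instead of relying on the model's token-level guess.
dots = "...................."  # hypothetical input string of dots

# len() counts characters exactly, independent of how the text is tokenized.
print(len(dots))  # prints the exact number of dots
```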


2. Spelling Recognition Errors ✍️

  • Problem: Vulnerable to tasks that require recognizing or manipulating individual characters.

  • Reason: The model stores and processes words as tokens rather than as individual characters.

  • Example 1: It cannot correctly print every third letter of "ubiquitous".

  • Example 2: When asked how many 'r's are contained in "strawberry", the model incorrectly answered "2" for a while.

    • This issue once went viral, with many people citing it as an example of the limitations of AI.

    • Reason: The model recognized the entire word "strawberry" as a token and could not analyze its individual characters.

  • How to fix:

    • Manipulating strings with Python code produces accurate results (see the sketch after this list).

    • Spell checking and character counting are better approached programmatically rather than through AI models.
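
A minimal Python sketch of both fixes; string slicing and .count() operate on raw characters, so tokenization never enters the picture:

```python
word = "ubiquitous"

# Take every third letter, starting from the first character.
print(word[::3])  # "uqts" (indices 0, 3, 6, 9)

# Count occurrences of the letter 'r' in "strawberry".
print("strawberry".count("r"))  # 3
```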


3. Simple logic operation errors ❌

  • Problem: Even simple number comparisons (e.g., is 9.11 greater than 9.9?) can produce wrong answers.

  • Reason:

    • Certain numbers (e.g. 9.11) may be recognized as Bible verses rather than as decimal values.

    • Errors occur when numbers are interpreted as contextual patterns rather than as simple mathematical operations.

  • Example: When asked whether 9.11 is greater than 9.9, the model sometimes gives a logically incorrect answer (the sketch below shows the correct comparison).
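
A minimal sketch of the programmatic check; a plain floating-point comparison returns the mathematically correct result:

```python
# Compare the two numbers as floats rather than as text patterns.
print(9.11 > 9.9)      # False: 9.11 is less than 9.90
print(max(9.11, 9.9))  # 9.9
```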


Conclusion ✅

✔️ Language models can show weaknesses in counting, spelling recognition, and simple logical operations.
✔️ Understanding these limitations and using complementary methods such as code execution can improve accuracy.
✔️ Use models as tools, but make it a habit to always verify their answers to important problems. 🔍