[YouTube Lecture Summary] Andrej Karpathy - Deep Dive into LLMs like ChatGPT

Introduction

Pre-Training

Step 1: Download and preprocess the internet

Step 2: Tokenization

Step 3: Neural network training

Step 4: Inference

Base model

Post-Training: Supervised Finetuning

Conversations

Hallucinations

Knowledge of Self

Models need tokens to think

Things the model cannot do well

Post-Training: Reinforcement Learning

Reinforcement learning

DeepSeek-R1

AlphaGo

Reinforcement learning from human feedback (RLHF)

Preview of things to come

Keeping track of LLMs

Where to find LLMs

DeepSeek-R1

1. Importance and significance of DeepSeek-R1

  • 🔍 Key Points

    • OpenAI does not disclose the training methods or details of its reinforcement learning (RL)-based LLMs.

    • In contrast, DeepSeek has released DeepSeek-R1, a model trained with reinforcement learning, as open source, so researchers can use it directly.

    • This marks a turning point that accelerates reinforcement learning-based LLM research in the AI research community.


2. Practical application and experimental results of DeepSeek-R1

  • 📈 Changes after applying reinforcement learning

    • Math problem-solving ability improves significantly

    • The model approaches problems in several different ways, and accuracy rises gradually as training proceeds

    • The model develops its own reasoning process

      • 🧐 "Wait a minute, let me check again."

      • 🤔 "Let's try another way to verify that this approach is correct."

      • ✅ "Now I can be sure of the answer!"

  • 🤯 The key point: the model learns human-like thought processes and develops problem-solving strategies on its own!
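
To make the idea of "try several approaches, keep what leads to correct answers" more concrete, here is a minimal toy sketch in Python. It is not DeepSeek's training method: the three strategy names, their success rates, and the update rule (a reward-weighted softmax update with the average reward as baseline) are all assumptions chosen for illustration. It only shows how rewarding correct answers can gradually shift a model toward strategies such as double-checking its work.

```python
import math

# Hypothetical strategies with made-up success rates on a fixed problem set.
# None of these names or numbers come from the lecture or from DeepSeek.
SUCCESS_RATE = {
    "answer_immediately": 0.30,
    "work_step_by_step": 0.60,
    "work_then_double_check": 0.90,
}


def softmax(logits):
    """Turn raw preference scores into a probability distribution."""
    peak = max(logits.values())
    exps = {name: math.exp(value - peak) for name, value in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def train(rounds=200, learning_rate=0.4):
    """Raise the preference for strategies that beat the current average reward."""
    logits = {name: 0.0 for name in SUCCESS_RATE}
    history = []
    for _ in range(rounds):
        probs = softmax(logits)
        # Expected accuracy of the current mix of strategies (used as baseline).
        accuracy = sum(probs[name] * SUCCESS_RATE[name] for name in probs)
        history.append(accuracy)
        for name in logits:
            # Positive when a strategy is right more often than the average.
            advantage = SUCCESS_RATE[name] - accuracy
            logits[name] += learning_rate * probs[name] * advantage
    return softmax(logits), history


if __name__ == "__main__":
    final_probs, accuracy_history = train()
    print("accuracy at start:", round(accuracy_history[0], 3))
    print("accuracy at end:  ", round(accuracy_history[-1], 3))
    for name, prob in final_probs.items():
        print(f"{name}: {prob:.3f}")
```

In this toy setup nobody tells the model to double-check its answers; that strategy simply earns more reward and therefore gets chosen more often, which loosely mirrors the self-checking behavior described above.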


3. How to use DeepSeek R1

  • 💻 Released as an open source model

    • Can be downloaded and run directly (⚠️ high-performance hardware required)

    • ☁️ Cloud services available

      • DeepSeek Official Website

      • DeepSeek-R1 can also be run on Together.ai

    • 🔬 Google's Gemini 2.0 Flash Thinking Experimental model also offers similar reasoning features


4. How to use it in real life

  • 🎯 Which model should I use in which situation?

    • 📚 General knowledge questions: use a standard LLM (⚡ quick answers)

    • 🧠 Problems requiring math and logical thinking: use a reasoning model (📈 high accuracy)
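
The rule of thumb above can be sketched as a tiny routing helper. This is only an illustration: the keyword list and the function name `pick_model_kind` are hypothetical, and a real application would likely need a more careful way to judge how much reasoning a question requires.

```python
# Hypothetical hints that a question needs multi-step reasoning.
REASONING_HINTS = ("prove", "solve", "calculate", "derive", "how many")


def pick_model_kind(question: str) -> str:
    """Return "reasoning" for math/logic-style questions, else "standard"."""
    text = question.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return "reasoning"  # slower, but more accurate on math and logic
    return "standard"       # quick answers for general knowledge


if __name__ == "__main__":
    for q in ("What is the capital of France?", "Solve 3x + 5 = 20 for x."):
        print(q, "->", pick_model_kind(q))
```

The trade-off is the one listed above: the standard model answers quickly, while the reasoning model spends more tokens thinking before it responds.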