[YouTube Lecture Summary] Andrej Karpathy - Deep Dive into LLMs like ChatGPT

Introduction

Pre-Training

Step 1: Download and preprocess the internet

Step 2: Tokenization

Step 3: Neural network training

Step 4: Inference

Base model

Post-Training: Supervised Finetuning

Conversations

Hallucinations

Knowledge of Self

Models need tokens to think

Things the model cannot do well

Post-Training: Reinforcement Learning

Reinforcement learning

DeepSeek-R1

AlphaGo

Reinforcement learning from human feedback (RLHF)

Preview of things to come

Keeping track of LLMs

Where to find LLMs

Preview of things to come

🚀 Future directions and prospects for AI model development 🔍

AI models are evolving rapidly, and the changes we can expect in the future are summarized below.


🔥 1. Multimodal AI

Currently, most AI models primarily process text, but future models are expected to handle audio, images, and video natively. 🎙️📸🎥

👉 How is it possible?

  • Audio can be tokenized by slicing its spectrogram (a time-frequency representation of the acoustic signal). 🎵

  • Images can be tokenized by breaking them into small patches. 🖼️

  • Ultimately, text, audio, images, and other inputs can all be converted into tokens, which a language model can then process. ✅
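
The two tokenization ideas above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any particular model does it: the function names `image_to_patches` and `spectrogram`, the non-overlapping frames, and the naive DFT are all choices made here for clarity.

```python
import cmath
import math


def image_to_patches(image, patch_size):
    """Split a 2D grid (list of rows) into flattened square patches, row-major."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


def spectrogram(samples, frame_size):
    """Magnitude spectrogram via a naive DFT over non-overlapping frames."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        magnitudes = []
        # Keep only the non-negative frequency bins of a real-valued signal.
        for k in range(frame_size // 2 + 1):
            total = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, x in enumerate(frame))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Usage: a 4x4 "image" becomes four 2x2 patches; a short sine wave
# becomes two spectrogram slices of five frequency bins each.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patches(image, 2)
wave = [math.sin(2 * math.pi * 2 * n / 8) for n in range(16)]
slices = spectrogram(wave, 8)
```

Each patch or spectrogram slice is still a vector of numbers at this point. A real system would additionally map these vectors to discrete token IDs or embeddings, a step this sketch omits.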

These changes will allow for more natural and intuitive communication with AI. 🤖💬


🏗️ 2. AI capable of performing long-running tasks (AI agents)

Current AI models mostly answer individual, short-lived requests, but AI agents that carry out multi-step tasks over long periods are expected to emerge.

👉 Expected changes

  • AI will emerge that can combine and execute multiple tasks on its own.

  • It will be able to keep working while detecting and correcting its own errors.

  • Humans will supervise the AI and intervene when necessary.

These developments will allow AI to move beyond being a simple information provider and become a digital assistant that actually does the work. 🛠️🤖
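
The pattern these bullets describe (run several tasks, recover from errors, escalate to a human) can be sketched as a small control loop. This is an illustrative sketch only: `run_agent`, `execute`, `ask_human`, and `max_retries` are hypothetical names introduced here, and a real agent would call a model or tools where `execute` is called.

```python
def run_agent(tasks, execute, max_retries=2, ask_human=None):
    """Run tasks in order; retry failures, then escalate to a human supervisor."""
    results = []
    for task in tasks:
        last_error = None
        for attempt in range(max_retries + 1):
            try:
                # `execute` is a hypothetical callable standing in for the
                # model or tool call that actually performs the task.
                results.append((task, execute(task, attempt)))
                break
            except Exception as error:
                # Error detected: remember it and try again.
                last_error = error
        else:
            # Every attempt failed, so a human has to step in.
            if ask_human is None:
                raise RuntimeError(f"task {task!r} failed") from last_error
            results.append((task, ask_human(task, last_error)))
    return results
```

The `for ... else` branch runs only when no attempt succeeded, which keeps the human out of the loop unless automatic correction has been exhausted.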


🕵️‍♂️ 3. Invisible AI

In the future, AI is expected to be integrated naturally into the everyday tools we use rather than confined to specific applications.

✔️ AI will be able to perform actions that replace keyboard and mouse operations.
✔️ Systems will be developed that learn users' habits and automate tasks.
✔️ AI functions will be naturally embedded in various software.

For example, it is quite possible that AI will directly control the user's computer and perform tasks on it, as ChatGPT's Operator feature does. 💻🖱️


🧠 4. AI capable of real-time learning (Test-Time Training)

Current AI models stop learning once training is complete.
That is, the model itself does not change when it receives new information; it simply generates output based on the input.

💡 But what about the future?

  • AI will be able to learn in real time from its interactions with users.

  • Models may gain the ability to acquire and update knowledge as humans do.

  • Tasks that require very long contexts will need approaches more efficient than the existing ones.

Current AI can only process information that fits within a fixed context window, but if long-term memory and learning capabilities are added, more advanced forms of AI will emerge. 🚀
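
The contrast between frozen inference, a bounded context window, and learning at test time can be made concrete with a toy model. This is a minimal sketch under loud assumptions: `ToyModel` is a hypothetical one-parameter model (`y = w * x`), and `test_time_update` uses a single gradient step purely for illustration, not as a description of how any production system learns.

```python
class ToyModel:
    """Toy one-parameter model y = w * x with a bounded context window."""

    def __init__(self, weight, context_window):
        if context_window < 1:
            raise ValueError("context_window must be at least 1")
        self.weight = weight
        self.context_window = context_window

    def visible_context(self, tokens):
        # Only the most recent `context_window` tokens reach the model;
        # anything older is simply dropped.
        return tokens[-self.context_window:]

    def predict(self, x):
        # Inference reads the weight but never modifies it.
        return self.weight * x

    def test_time_update(self, x, target, learning_rate=0.1):
        # Hypothetical test-time learning: one gradient step on the
        # squared error (w * x - target) ** 2.
        gradient = 2 * (self.weight * x - target) * x
        self.weight -= learning_rate * gradient
        return self.weight


# Usage: predictions leave the weight untouched, while an explicit
# test-time update changes what the model will output afterwards.
model = ToyModel(weight=0.0, context_window=3)
recent = model.visible_context([1, 2, 3, 4, 5])
before = model.predict(2.0)
model.test_time_update(x=1.0, target=2.0)
after = model.predict(2.0)
```

In this sketch, information outside the window is lost unless it is written into the weight, which is the gap that long-term memory or test-time training would aim to close.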