Introduction
Pre-Training
Step 1: Download and preprocess the internet
Step 2: Tokenization
Step 3: Neural network training
Step 4: Inference
Base model
Post-Training: Supervised Finetuning
Conversations
Hallucinations
Knowledge of Self
Models need tokens to think
Things the model cannot do well
Post-Training: Reinforcement Learning
Reinforcement learning
DeepSeek-R1
AlphaGo
Reinforcement learning from human feedback (RLHF)
Preview of things to come
Keeping track of LLMs
Where to find LLMs
✔ AI models think by generating words (tokens) sequentially, from left to right.
✔ The amount of computation that can go into generating a single token is limited.
✔ In other words, if the model tries to solve a complex problem all at once, accuracy is likely to drop.
💡 Problem:
Emily bought 3 apples and 2 oranges. Each orange costs $2, and the total cost is $13. What is the cost of one apple?
Answer given all at once: "The answer is 3"
🔴 Reason:
The model must fit the entire calculation into a single token prediction, which is more than the limited computation per token can handle reliably.
The more complex the problem, the more likely the answer is wrong.
Answer worked out step by step: "The price of two oranges is $4. Subtracting $4 from the total price leaves $9. Since three apples cost $9, one apple costs $3."
🟢 Reason:
Encouraging the model to think step by step improves accuracy, because each token only has to carry a small part of the calculation.
This helps the model work through complex problems logically.
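The step-by-step answer to the apple problem can also be written as a short derivation. This is only an illustrative sketch; the symbol $a$ (the price of one apple, in dollars) is an assumption introduced here and does not appear in the original problem statement.

```latex
\begin{align*}
3a + 2 \cdot 2 &= 13 \\
3a &= 13 - 4 = 9 \\
a &= 9 / 3 = 3
\end{align*}
```

Each line corresponds to one small intermediate result, which is the kind of work that fits comfortably in a few tokens.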
🚨 The model cannot perform too many calculations in a single operation (one token prediction).
📉 As the numbers involved get larger, the chance of a wrong answer increases.
✔ Solution
Prompt the model to generate answers that include a step-by-step calculation process.
Guide it toward a logical approach that writes out intermediate results.
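One way such a prompt might be assembled is sketched below. The helper name `build_step_by_step_prompt` and the exact instruction wording are hypothetical, chosen only for illustration; they are not taken from any particular library or from the original material.

```python
def build_step_by_step_prompt(problem: str) -> str:
    """Wrap a word problem in instructions that ask for intermediate results.

    Hypothetical helper: the wording of the instruction is an assumption.
    """
    return (
        f"Problem: {problem}\n"
        "Solve this step by step. Write each intermediate result "
        "on its own line, then state the final answer on the last line."
    )


problem = (
    "Emily bought 3 apples and 2 oranges. Each orange costs $2, "
    "and the total cost is $13. What is the cost of one apple?"
)
prompt = build_step_by_step_prompt(problem)
print(prompt)
```

The resulting string would then be sent to whichever model is being used; the point is only that the request explicitly asks for the intermediate steps rather than a bare answer.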
🤖 AI models have limited ability to do calculations "in their head," but they are excellent at writing code.
💡 Calculations become more accurate when the model uses a programming language such as Python.
✔ Example:
"Write Python code that calculates the price of an apple."
➡ The model can run code such as price = (13 - 2*2) / 3 and arrive at the correct answer.
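A slightly fuller version of that calculation might look like the following. This is a minimal sketch, assuming the numbers from the apple problem above; the function name `apple_price` and the variable names are illustrative, not something the model is guaranteed to produce.

```python
# Known quantities from the word problem.
total_cost = 13     # dollars paid in total
num_apples = 3
num_oranges = 2
orange_price = 2    # dollars per orange


def apple_price(total, n_apples, n_oranges, price_per_orange):
    """Solve total = n_apples * apple + n_oranges * price_per_orange for apple."""
    oranges_cost = n_oranges * price_per_orange  # cost of all the oranges
    apples_cost = total - oranges_cost           # what remains for the apples
    return apples_cost / n_apples                # price of a single apple


price = apple_price(total_cost, num_apples, num_oranges, orange_price)
print(price)  # 3.0
```

Because the arithmetic is done by the Python interpreter rather than inside a token prediction, the result does not depend on the model calculating correctly.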
📌 Conclusion:
Accuracy increases when the model uses code execution instead of doing the calculation directly.
For complex calculations, it is best to actively use Python code execution.