Introduction
Pre-Training
Step 1: Download and preprocess the internet
Step 2: Tokenization
Step 3: Neural network training
Step 4: Inference
Base model
Post-Training: Supervised Finetuning
Conversations
Hallucinations
Knowledge of Self
Models need tokens to think
Things the model cannot do well
Post-Training: Reinforcement Learning
Reinforcement learning
DeepSeek-R1
AlphaGo
Reinforcement learning from human feedback (RLHF)
Preview of things to come
Keeping track of LLMs
Where to find LLMs
"The base model is a large-scale language model trained on large amounts of text data, without any specific human feedback, that probabilistically predicts the next word or sentence in a given context."
Trained without human-labeled supervision (self-supervised next-token prediction)
No human data labeling
Learning based on large amounts of text data collected from the Internet, books, papers, etc.
A model trained without a specific task in mind
Not trained to answer specific questions or perform specific tasks
A model that simply looks at the given context and predicts the next word (token).
A simple "token predictor"
Generates the next word (token) probabilistically based on the pattern of the input text.
Predicted sentences are generated based on statistical similarity to the training data.
Not rule-based
Rather than learning linguistic rules directly, it follows probabilistic patterns.
It has no explicit intention or built-in reasoning; it reproduces patterns seen in the training data.
It is just a simple "sentence predictor", without any logical thinking or question-answering ability.
Since it only predicts based on what it has learned from the Internet, it may generate information that was not in its training data and is not true (📌 the "hallucination" problem)
No built-in ability to perform specific tasks on request (e.g., translating, summarizing, coding)
The base model by itself is not a practical AI → Additional training (post-training) is required for it to interact with humans.
📌 In other words, the Base model is a "probabilistic reproducer of Internet documents," not a conversational AI or a task-specific performance model.
📌 Post-training is absolutely necessary to make the Base model into a practical AI.
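The "probabilistic token predictor" idea above can be illustrated with a toy sketch. This is not how a real base model is built (real models use neural networks over subword tokens); it is a minimal bigram counter, assuming a tiny made-up corpus and whitespace tokenization, that only shows what "predict the next token in proportion to patterns seen in training data" means. All function names here are hypothetical.

```python
import random
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_distribution(counts, prev):
    """Turn raw follow-up counts for `prev` into probabilities."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {tok: c / total for tok, c in followers.items()}

def sample_next(counts, prev, rng):
    """Sample the next token in proportion to its observed frequency."""
    dist = next_token_distribution(counts, prev)
    if not dist:
        return None  # context never seen in training: nothing to predict from
    tokens = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up training text (an assumption for illustration only).
corpus = "the cat sat on the mat and the cat ran".split()
counts = train_bigram(corpus)

dist = next_token_distribution(counts, "the")
sampled = sample_next(counts, "the", random.Random(0))
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the predictor assigns them probabilities 2/3 and 1/3 and samples accordingly. Nothing in the code knows grammar or meaning, which is the point the notes make about the base model.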
Well-designed prompts can accomplish specific tasks.
Example input:
apple → 사과
banana → 바나나
teacher → ?
By picking up the pattern in the prompt, the model is guided to predict the Korean translation of "teacher" (선생님).
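A few-shot prompt like the one above is just a string. A minimal sketch of building it follows, assuming a hypothetical helper name `build_few_shot_prompt` and leaving the last line open-ended (instead of writing "?") so that a base model would continue it with the translation. No model is called here.

```python
def build_few_shot_prompt(examples, query, arrow=" → "):
    """Format (source, target) pairs as lines, then leave the query unanswered."""
    lines = [f"{src}{arrow}{tgt}" for src, tgt in examples]
    # The final line ends right after the arrow so the model completes it.
    lines.append(f"{query}{arrow}")
    return "\n".join(lines)

examples = [("apple", "사과"), ("banana", "바나나")]
prompt = build_few_shot_prompt(examples, "teacher")
print(prompt)
```

The resulting text would be sent to a base model as-is; the model's most likely continuation follows the pattern set by the examples.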
You can make the Base model act like an AI assistant by giving it a conversation script formatted like a web page.
Human: Hi! Who are you?
AI: Hello! I am an AI assistant. I can answer questions, provide explanations, and assist with various tasks. How can I help you?
Human: Can you explain how photosynthesis works?
AI: Sure! Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy.
The basic chemical equation is:
6CO₂ + 6H₂O + light energy → C₆H₁₂O₆ + 6O₂
This process provides oxygen for us to breathe and serves as an essential energy source for most ecosystems. Would you like a more detailed explanation?
Human: No, that's fine. Instead, can you help me write a short email?
AI: Of course! Who is the recipient, and what should the email include?
Human: I want to write an email to my manager requesting a day off next week.
AI: Here's a draft you can use:
Subject: Request for Day Off Next Week
Dear [Manager’s Name],
I hope this message finds you well. I am writing to request a day off on [specific date]. Please let me know if this works for the team’s schedule or if there are any concerns.
Thank you for considering my request. Please let me know if you need further details.
Best regards,
[Your Name]
Does this draft work for you, or would you like to make any changes?
Human:
If you add your own message after the final "Human:", the Base model continues the script, so you can use it like a conversational AI.
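The conversation-script trick above can be sketched as plain string handling. The helper names below are hypothetical, and `raw_completion` is a made-up stand-in for what a base model might return, not real model output. Because a base model simply continues the document, it tends to keep writing both sides of the dialogue, so this sketch cuts the continuation at the next "Human:" line.

```python
def build_chat_prompt(turns, user_message):
    """Render prior turns as a script, add the new message, and end on 'AI:'."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"Human: {user_message}")
    lines.append("AI:")  # the base model is expected to continue from here
    return "\n".join(lines)

def truncate_at_stop(completion, stop="\nHuman:"):
    """Keep only the text before the model starts writing the next human turn."""
    idx = completion.find(stop)
    return completion if idx == -1 else completion[:idx]

turns = [
    ("Human", "Hi! Who are you?"),
    ("AI", "Hello! I am an AI assistant."),
]
prompt = build_chat_prompt(turns, "Can you explain how photosynthesis works?")

# Made-up continuation, for illustration only.
raw_completion = (
    " Sure! Photosynthesis converts light into chemical energy."
    "\nHuman: Thanks!\nAI: You're welcome."
)
reply = truncate_at_stop(raw_completion).strip()
```

This only imitates a chat interface; the underlying model is still completing a document.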
📌 However, since the Base model is not a chatbot by nature, it cannot become a reliable conversational AI without post-training.
Base model = Probabilistic predictor of Internet sentences
Simple auto-completion engine, unable to understand questions or think logically.
Post-Training is essential for practical AI
Prompt engineering alone gives only limited usability
➡ The Base model is the first step of AI, and it needs to be improved before it can be used as practical AI. 🚀