# What Is an LLM?
A Large Language Model (LLM) is an AI model trained on billions of pages of text to understand and generate human language. 'Large' refers to model size — measured in billions of parameters (numbers learned during training).
Parameters are the model's 'memory.' GPT-4 is estimated to have roughly 1 trillion parameters. Each parameter is a small number; together, these numbers store knowledge about language, facts, and reasoning.
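To make the idea of "parameters" concrete, here is a minimal sketch that counts the weights and biases in a toy stack of dense layers. The layer shapes are invented for illustration; real LLMs have far more complex architectures, but their parameter counts are tallied the same way.

```python
def count_parameters(layer_shapes):
    """Count parameters in a stack of dense layers given (in, out) shapes."""
    total = 0
    for n_in, n_out in layer_shapes:
        total += n_in * n_out  # one weight per input-output connection
        total += n_out         # one bias per output unit
    return total

# A hypothetical 3-layer toy network -- nowhere near GPT-4's scale,
# but the same counting principle applies.
tiny = [(512, 2048), (2048, 2048), (2048, 512)]
print(count_parameters(tiny))  # a few million; GPT-4 is estimated near 1e12
```

Each of those counted numbers is adjusted during training; collectively they are the model's learned "memory."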
## How It Works: Two Stages
Stage 1 – Pre-training: The model reads billions of pages of text from the internet, books, and code. Its task is simple: predict the next word. From trillions of such predictions, the model learns grammar, facts, logic, and the nuances of writing.
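The 'predict the next word' objective can be sketched with a toy bigram model. Real LLMs use neural networks over subword tokens rather than raw word counts, so this is only an illustration of the training objective, not of how an LLM actually works internally.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in a tiny 'training corpus'."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the word most frequently seen after `word`, or None."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ran"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # 'cat' -- seen twice after 'the'
```

Scale this idea up from word counts to a trillion-parameter neural network and trillions of training examples, and next-word prediction starts to capture grammar, facts, and style.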
Stage 2 – Fine-tuning & RLHF: The model is refined on high-quality conversation examples and then optimized with Reinforcement Learning from Human Feedback (RLHF) to be more helpful, honest, and safe.
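The core of the RLHF idea is that a reward model scores candidate responses and training pushes the model toward higher-scoring ones. The sketch below is a heavily simplified illustration: `toy_reward` is a made-up heuristic standing in for a learned reward model, not anything used in practice.

```python
def toy_reward(response):
    """Hypothetical stand-in for a learned reward model:
    prefer answers that are concrete (longer) but not rambling."""
    n = len(response.split())
    return n if n <= 30 else 30 - (n - 30)

def pick_preferred(candidates, reward_fn):
    """Choose the candidate response the reward model rates highest."""
    return max(candidates, key=reward_fn)

candidates = [
    "I don't know.",
    "Water extinguishes most ordinary fires by cooling the fuel "
    "below its ignition temperature.",
]
print(pick_preferred(candidates, toy_reward))  # the longer, concrete answer
```

In actual RLHF, the reward model is itself a neural network trained on human preference rankings, and the LLM's parameters are updated (e.g. via PPO) to produce responses that score higher.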
## Why Does an LLM Seem 'Smart'?
LLMs' surprising capabilities emerge from hidden patterns in massive training data. No one explicitly taught the model 'if there is fire, use water' — yet because it has read thousands of texts about fires and problem-solving, it can draw that conclusion itself.
## Important Limitations
- Hallucination: LLMs can produce false facts with high confidence.
- Knowledge cutoff: Training data has a cutoff date.
- No true understanding: LLMs match statistical patterns; they do not understand the world the way humans do.
- Bias: LLMs reflect biases present in their training data.