During pretraining, large language models (LLMs) learn to statistically predict the next word in a sequence, which enables them to handle patterns like grammar and spelling but does not guarantee reliable answers to tricky factual questions.
Pretraining, this initial stage, is where the model ingests vast text corpora and learns the next-word prediction objective.
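The next-word prediction objective can be sketched with a toy bigram counter. This is a hypothetical, drastically simplified illustration (real LLMs use neural networks over subword tokens, not word counts); the corpus and function names here are invented for the example.

```python
from collections import Counter, defaultdict

# Toy corpus; real pretraining uses vast text collections.
corpus = "the cat sat on the mat the cat ate the food".split()

# Count how often each word follows another (a bigram model),
# a stand-in for learning statistical next-word prediction.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the statistically most frequent next word after `word`."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Even this tiny model captures local patterns in its training text, which mirrors why pretraining yields fluent grammar and spelling while offering no guarantee of factual reliability: the model reproduces what is statistically common, not what is true.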