Fine-tuning and posttraining methods, including training on human feedback, are used to steer pretrained LLMs toward safer and more accurate behavior by updating their parameters against additional training objectives.
high
process
Posttraining refers to training stages applied after pretraining, such as reinforcement learning from human feedback (RLHF) and other fine-tuning techniques.
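
As a rough illustration of what an "additional training objective" can look like in reinforcement learning from human feedback, the sketch below computes a pairwise preference loss of the kind commonly used to train a reward model on human comparisons. It is a minimal example under stated assumptions, not a description of any particular system: the function name `preference_loss` is hypothetical, and the two scalar inputs are assumed to be scores that some reward model assigned to a human-preferred response and a rejected response.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper for illustration. The inputs are assumed to be
    scalar scores from a reward model for two responses to the same prompt,
    where a human labeler preferred the first response.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of m
    # so that exp() is never called on a large positive number.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the preferred response scores higher,
# and large when the rejected response scores higher.
agrees_with_human = preference_loss(2.0, 0.0)
disagrees_with_human = preference_loss(0.0, 2.0)
print(agrees_with_human < disagrees_with_human)
```

Minimizing this loss over many human-labeled pairs pushes the reward model to score preferred responses higher; a later reinforcement learning stage can then use those scores as the training signal for the language model itself.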