LIMA: Less Is More for Alignment
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two…