Textbooks Are All You Need
We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of “textbook quality” data from the web (6B tokens) and synthetically generated t…