Microsoft Releases 1.3 Bn Parameter Language Model, Outperforms LLaMA
Microsoft Research has upped the game with an even smaller model: phi-1, a transformer-based model with just 1.3 billion parameters.
I wonder if higher-quality datasets are the future, rather than tons of internet-scraped text. Either way, neat model!
Bad article title. This is the "Textbooks Are All You Need" paper from a few days ago. It's programming-focused and I think Python-only. For general-purpose LLM use, LLaMA is still better.
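
For anyone who wants to poke at it once the weights are public, something like this should work for Python completion via Hugging Face transformers. Note the "microsoft/phi-1" model ID is my guess, not anything confirmed in the paper:

    # Minimal sketch: Python code completion with phi-1 via the
    # transformers library. The "microsoft/phi-1" model ID is an
    # assumption; swap in whatever ID the release actually uses.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/phi-1"  # assumed Hub ID, not confirmed
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # phi-1 is trained on Python, so prompt with a signature + docstring
    # and let the model fill in the body.
    prompt = (
        "def is_prime(n: int) -> bool:\n"
        '    """Return True if n is a prime number."""\n'
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))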