How Cerebras boosted Meta’s Llama to ‘frontier model’ performance
Cerebras used chain of thought at inference time to make a smaller AI model match or beat a larger one.

Cerebras Systems announced on Tuesday that it has made a small version of Meta Platforms's Llama perform as well as a large version by adding the increasingly popular approach in generative artificial intelligence (AI) known as "chain of thought." The AI computer maker announced the advance at the start of the annual NeurIPS conference on AI.

"This is a closed-source-only capability, but we wanted to bring this capability to the most popular ecosystem, which is Llama," said James Wang, head of Cerebras's product marketing effort, in an interview with ZDNET.

The project is the latest in a line of open-source efforts Cerebras has undertaken to demonstrate the capabilities of its purpose-built AI computer, the "CS-3," which it sells in competition with the status quo in AI: GPU chips from the customary vendors, Nvidia and AMD.

Also: DeepSeek challenges OpenAI's o1 in chain of thought – but it's missing a few links

The company was able to train the open-source Llama 3.1 model that uses only 70 billion parameters to reach the same or better accuracy on various benchmark tests as the much larger 405-billion-parameter version of Llama. Those tests include CRUX, a test of "complex reasoning tasks" developed at MIT and Meta, and LiveCodeBench, a code-generation benchmark developed at U.C. Berkeley, MIT, and Cornell University, among others.

Chain of thought can enable models trained with less time, data, and computing power to equal or surpass a large model's performance.
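The core idea of inference-time chain of thought, spending extra generated tokens on intermediate reasoning steps before committing to a final answer, can be sketched roughly as follows. This is an illustrative sketch only: the `generate` stub stands in for any LLM call (such as a Llama 3.1 70B endpoint), and the prompt wording and `Answer:` convention are assumptions for demonstration, not Cerebras's actual method.

```python
# Minimal sketch of inference-time chain of thought.
# Assumption: `generate` is a stand-in for a real LLM call; here it is
# stubbed with a hard-coded reasoning trace for demonstration.

COT_SUFFIX = (
    "\n\nLet's think step by step, "
    "then give the final answer after 'Answer:'."
)

def generate(prompt: str) -> str:
    """Stub for an LLM call; a real system would query the model here."""
    # Hard-coded reasoning trace, for demonstration only.
    return ("Step 1: 17 * 3 = 51. "
            "Step 2: 51 + 4 = 55. "
            "Answer: 55")

def ask_with_cot(question: str) -> str:
    """Prompt for step-by-step reasoning, then extract the final answer."""
    completion = generate(question + COT_SUFFIX)
    # The reasoning tokens are "spent" at inference time; only the text
    # after the final 'Answer:' marker is returned as the model's response.
    return completion.rsplit("Answer:", 1)[-1].strip()

print(ask_with_cot("What is 17 * 3 + 4?"))  # prints "55"
```

The extra reasoning tokens are what cost additional compute at inference time; the trade the article describes is spending that compute with a 70-billion-parameter model instead of running a 405-billion-parameter model directly.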