On the last day of OpenAI’s 12 days of ‘shipmas,’ the company unveiled its latest models, o3 and o3-mini, which excel at reasoning and outperform o1 on a series of benchmarks, including math and science tests. At the time, OpenAI CEO Sam Altman said o3-mini was slated to launch at the end of January, and today, the company made good on its promise.
o3-mini
On Friday, OpenAI released o3-mini, the most cost-efficient model in its reasoning series, to the public. Until now, that series consisted of o1 and o1-mini. Like its predecessor, the model is particularly strong in science, math, and coding, according to the company.
OpenAI o3-mini is now available in ChatGPT and the API.
Pro users will have unlimited access to o3-mini and Plus & Team users will have triple the rate limits (vs o1-mini).
Free users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.
OpenAI (@OpenAI), January 31, 2025
When o3-mini is selected, it uses medium reasoning effort, which balances speed and accuracy. While the original o1 model still has broader general knowledge than o3-mini, the new model’s main advantages are that it is faster and performs better than o1-mini.
Benchmark performance
Expert testers who compared the two models found that o3-mini delivered more accurate, better-reasoned, and clearer responses than o1-mini. According to OpenAI, they preferred o3-mini’s responses 56% of the time and observed a 39% reduction in major errors.
Beyond human preference evaluations, o3-mini with medium reasoning effort, which is what ChatGPT users get by default, outperformed o1-mini on several STEM benchmarks, including Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and Competition Code (Codeforces).