Elyse Betters Picaro / ZDNET

Following the recent launch of its new family of GPT-4.1 models, OpenAI released o3 and o4-mini on Wednesday, the latest additions to its line of reasoning models. The o3 model, previewed in December, is OpenAI’s most advanced reasoning model to date, while o4-mini is a smaller, cheaper, and faster model.

Meet o3 and o4-mini

Simply put, reasoning models are trained to “think before they speak”: they take more time to process a prompt but produce higher-quality responses. Like the older reasoning models, o3 and o4-mini show strong, and even improved, performance on coding, math, and science tasks. They also add an important new capability: visual understanding.

OpenAI o3 and o4-mini are OpenAI’s first models that can “think with images.” OpenAI explains that this means the models don’t just see an image; they can actually use the visual information in their reasoning process. Users can now upload images that are low quality or blurry, and the models will still be able to understand them.

Another major first is that o3 and o4-mini can independently, or agentically, use all ChatGPT tools, including web browsing, Python, image understanding, and image generation, to better resolve complex, multi-step problems. OpenAI says this ability allows the new models to take “a step toward a more agentic ChatGPT that can independently execute tasks on your behalf.”

In the livestream launching the models, the team explained that, in the same way a person might use a calculator to deliver better results, the new models can now employ all of OpenAI’s advanced tools to deliver better results. For example, in a demo, a researcher fed a scientific research poster to o3 and prompted it to analyze the image and draw a conclusion that wasn’t included in the poster.
To get the answer, o3 browsed the internet on its own and zoomed into different elements of the image to generate a conclusive answer, showcasing both its ability to use multiple tools independently and its ability to analyze images in detail.