OpenAI is pushing for industry-specific AI benchmarks – why that matters

Benchmark results typically accompany the launch of every new AI model to showcase how well it performs on various tasks. However, these tasks are general rather than tailored to individual industries, such as grade school mathematics (GSM8K) or graduate-level reasoning (GPQA).

OpenAI Pioneers Program

To fill that gap, OpenAI launched the OpenAI Pioneers Program, intended to advance AI model development for specific industries and real-world use cases. The program is a two-pronged effort in which companies will collaborate with OpenAI researchers to develop more domain-specific evaluations and fine-tuned models.

In its blog post, OpenAI shared that “industries like legal, finance, insurance, healthcare, accounting, and many others are missing a unified source of truth for model benchmarking.” As a result, OpenAI will now work with multiple companies across each industry to develop those evaluations, which are aimed not only at developing models but also at building better trust between the public and these systems.

Research has highlighted this lack of benchmarks as a major gap in AI for enterprise use cases. For example, Silvio Savarese, head of Salesforce AI Research, released a blog post on Enterprise General Intelligence (EGI), a concept he is pioneering that refers to more advanced AI solutions tailored to businesses’ domain-specific needs. In a conversation with ZDNET, he shared that one of the major steps needed to reach EGI is developing benchmarks that evaluate domain-specific functions.

Refining existing models

Beyond evaluations, OpenAI will also collaborate with participating companies to refine existing models for three industry-specific use cases using a technique known as reinforcement fine-tuning (RFT). The OpenAI team will guide the companies on how to use RFT, and the companies can then decide how to deploy the resulting models, which should be ready for large-scale deployment, according to OpenAI.

The first cohort will consist of a handful of startups working on use cases that can “drive real-world impact.” If your company fits that description, you can apply by filling out the form on the OpenAI Pioneers Program webpage with basic information about the company.

Source: Robotics - zdnet.com