
Nvidia announces raft of ‘NIMs’ to speed up Gen AI apps

[Image: Nvidia hero graphic for its NIM AI microservices. Credit: Nvidia]

Chip giant Nvidia on Monday announced, at the kick-off of the annual SIGGRAPH computer graphics conference, a raft of offerings for its “NIM” container software to address numerous functions and industries, and a cloud service for running the programs, called “Inference-as-a-service,” in partnership with developer tools site Hugging Face.

The new NIMs include pre-packaged software for running AI models that are optimized for use as “copilots” and to interact with retrieval-augmented generation, or “RAG,” infrastructure, an increasingly popular way to hook up large language models to external databases and applications.
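The RAG pattern mentioned above can be sketched in a few lines: retrieve relevant text from an external store, then prepend it to the prompt sent to the language model. The document store and keyword-overlap scoring below are deliberately naive placeholders, not anything from Nvidia's software.

```python
# Toy sketch of retrieval-augmented generation (RAG): fetch relevant
# context from an external store, then augment the model's prompt with it.
# A real system would use a vector database and embeddings instead of
# this naive keyword-overlap scoring.

DOCS = [
    "NIMs are containerized AI models accessed through an API.",
    "OpenUSD is a standard for 3D scene interchange.",
]

def retrieve(query: str) -> str:
    """Return the stored document sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(DOCS, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(query: str) -> str:
    """Prepend retrieved context to the user's question before sending
    the combined prompt to a language model."""
    return f"Context: {retrieve(query)}\n\nQuestion: {query}"

print(build_prompt("How are NIMs accessed"))
```

The augmented prompt, rather than the bare question, is what gets sent to the model, which is how RAG grounds a model's answers in external data.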

From a couple dozen NIMs a year ago, Nvidia has expanded the roster to over a hundred NIMs, across different industries and use cases, Kari Briski, Nvidia’s vice president of generative AI software product management, said in a press briefing.

Also: Nvidia teases Rubin GPUs and CPUs to succeed Blackwell in 2026

The NIMs package up “the best curated models from model partners” including Google, Meta, Microsoft, and Snowflake, and open-source models “with appropriate licenses for commercial production.”

(An “AI model” is the part of an AI program that contains the neural network’s parameters and activation functions, the key elements that determine how the program behaves.)


Nvidia’s new NIMs include those that run open-source AI models such as Meta’s Llama 3.1 language model, introduced last week, and Mistral NeMo 12B, jointly developed by Nvidia and the French AI company Mistral AI. There’s also a new NIM meant to “bring to life” chatbots by adding speech; it is built around the Parakeet model for automatic speech recognition, developed by Nvidia and the AI startup Suno.ai.

Other NIMs focus on robotics and digital biology. Nvidia also has new NIMs focused on 3D product development for use with “open universal scene description,” or OpenUSD, a standard for translating between different 3D simulation environments. Nvidia has been building OpenUSD in conjunction with other industry giants, including Apple.

Nvidia also upgraded a NIM oriented to graphics rendering, the Edify NIM, in conjunction with stock photography provider Getty Images. The upgrade greatly improves the rendering speed of Getty’s generative AI image making.

Also: Both of Getty’s commercial-safe AI image generators just got smarter and faster 

In a related note, Nvidia announced the availability of the first version of its “Omniverse” SDK for Apple’s Vision Pro headset. Omniverse is Nvidia’s version of the metaverse, focused on productivity tasks such as sharing large 3D models between teams building products.

NIM, an acronym for Nvidia Inference Microservices, is software infrastructure that is part of Nvidia’s AI Enterprise offering, first introduced in January 2023. A NIM packages an AI model in an application container that runs under a container manager, such as Kubernetes, and is accessed by developers via an API. As a microservice version of an AI model, it is meant to be easily “dropped into” applications.
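From a developer's perspective, "dropped into" means the model is reached over HTTP like any other microservice. The sketch below builds a chat-completion request body of the kind a deployed NIM typically accepts; the endpoint URL and model name are illustrative assumptions, not details confirmed by this article.

```python
import json

# Hypothetical endpoint for a NIM container deployed locally; the port
# and path are assumptions for illustration only.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str, model: str = "meta/llama-3.1-70b-instruct") -> str:
    """Build the JSON body a client would POST to a deployed NIM's
    chat-completion endpoint. The model identifier is an assumed example."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    })

body = build_request("Summarize NIMs in one sentence.")
print(body)
# A real call would then be something like:
#   requests.post(NIM_URL, data=body, headers={"Content-Type": "application/json"})
```

Because the container hides the model weights, runtime, and GPU optimization behind this one API surface, swapping models is a matter of pointing the same client code at a different NIM.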

Also: I broke Meta’s Llama 3.1 405B with one question (which GPT-4o gets right)

Meanwhile, the Hugging Face inference service, which runs on Nvidia’s own DGX Cloud, an infrastructure service, provides a dramatic boost to performance, Briski said. For example, the new Llama 3.1 70B model, introduced last week by Meta, can perform inference operations up to five times faster than when run on “off-the-shelf” hardware, Nvidia said.

Although Hugging Face offers about 750,000 models, Briski said, the inference-as-a-service is limited for the moment to the models Nvidia has “NIM-ified,” as the company puts it, meaning turned into NIMs.

NIMs can be run outside the Hugging Face service, in any environment the client desires, including on-premises, Briski said, provided clients have access to GPUs and subscribe to the Nvidia Enterprise offering, which costs $4,500 per GPU per year.
