When first released, the joint platform let enterprises deploy generative AI applications and included a vector database, so companies could use retrieval-augmented generation (RAG) to make their generative AI produce more accurate and up-to-date answers.
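The RAG pattern mentioned above can be sketched in a few lines: retrieve the documents most similar to the user's question from a store, then prepend them to the prompt so the model answers from current, company-specific data. The embedding, similarity function, and prompt format below are illustrative stand-ins, not the VMware or Nvidia APIs; production systems use neural embeddings and a real vector database.

```python
# Minimal RAG sketch. All names here are hypothetical; real deployments
# would use a neural embedding model and a vector database instead of
# the toy term-frequency vectors below.
import math
from collections import Counter

def embed(text):
    """Toy embedding: a term-frequency vector over whitespace tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Ground the LLM's answer in the retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context.")

docs = [
    "The VPN gateway was migrated to the Frankfurt region in March.",
    "Lunch is served in the cafeteria from noon to two.",
]
prompt = build_prompt("Where is the VPN gateway hosted?", docs)
```

The prompt that reaches the model now contains the relevant internal document, which is what lets a general-purpose LLM answer with facts it was never trained on.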
“The piece that we were missing was a model store manager,” says Paul Turner, vice president of products in the VMware Cloud Foundation division at Broadcom.
The model store allows enterprises to make a curated selection of AI models available to their developers, along with access controls for those models.
“And it makes sure that nobody’s using just general-purpose large language models that you don’t want to support,” Turner says. “Because out there on the Internet, you don’t know the provenance of that LLM and where it’s coming from. This gives you a way to manage those LLMs across your user base so that you can truly let them turn on their generative AI innovation.”
VMware customers can use Nvidia’s AI models, as well as models from Hugging Face and other partners, including Meta’s Llama 3 and models from Google and Mistral. “Whatever Nvidia supports, we support,” Turner says.
In addition to the model store, new capabilities include tools to secure the models with integrated access controls, a streamlined deployment workflow, and reference AI workflows for specialized use cases such as customer service, drug discovery, and PDF data extraction.