Red Hat has updated Red Hat OpenShift AI, its cloud-based AI and machine learning platform, adding a model registry with versioning and tracking capabilities, data drift detection and bias detection tools, and LoRA (low-rank adaptation) fine-tuning capabilities. Stronger security is also offered, Red Hat said.
Version 2.15 of Red Hat OpenShift AI will be generally available in mid-November. Features highlighted in the release include:
- A model registry, currently in a technology preview state, that provides a structured way to share, version, deploy, and track models, metadata, and model artifacts.
- Data drift detection, to monitor changes in input data distributions for deployed ML models. This capability lets data scientists detect when the live data used for model inference deviates significantly from the data on which the model was trained. Drift detection helps verify model reliability.
- Bias detection tools to help data scientists and AI engineers monitor whether models are fair and unbiased. These predictive tools, from the TrustyAI open source community, also monitor models for fairness during real-world deployments. (A minimal sketch of both the drift and fairness checks follows this list.)
- Fine-tuning with LoRA, to enable more efficient fine-tuning of LLMs (large language models) such as Llama 3. Organizations thus can scale AI workloads while reducing costs and resource consumption. (See the LoRA sketch after this list.)
- Support for Nvidia NIM, a set of inference microservices to accelerate the delivery of generative AI applications.
- Support for AMD GPUs, along with access to an AMD ROCm workbench image for model development on AMD hardware.
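Red Hat builds the drift and bias monitoring on TrustyAI, but the announcement does not show code. The sketch below illustrates only the underlying statistics, not TrustyAI's actual API: a two-sample Kolmogorov-Smirnov test for drift and a statistical parity difference (SPD) fairness metric, run here on synthetic placeholder data.

```python
import numpy as np
from scipy import stats

def detect_drift(train_col: np.ndarray, live_col: np.ndarray, alpha: float = 0.05) -> bool:
    """Two-sample Kolmogorov-Smirnov test: has the live input
    distribution shifted away from the training distribution?"""
    result = stats.ks_2samp(train_col, live_col)
    return result.pvalue < alpha  # True => statistically significant drift

def statistical_parity_difference(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Fairness metric: difference in positive-outcome rates between an
    unprivileged group (group == 0) and a privileged group (group == 1).
    Values near 0 suggest parity; TrustyAI exposes a similar SPD metric."""
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

# Hypothetical data: training vs. live feature values, plus predictions.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)
live = rng.normal(0.4, 1.0, 1_000)   # shifted mean => drift should trigger
preds = rng.integers(0, 2, 1_000)
groups = rng.integers(0, 2, 1_000)

print("drift detected:", detect_drift(train, live))
print("SPD:", statistical_parity_difference(preds, groups))
```

In a deployed setting these checks would run continuously against inference traffic rather than on a static sample, which is the monitoring role OpenShift AI assigns to TrustyAI.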
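The announcement does not prescribe a toolkit for LoRA fine-tuning. As one hedged illustration, this is roughly what LoRA fine-tuning of a Llama-style model looks like with Hugging Face's peft library; the model ID and hyperparameters are placeholder assumptions, not values from Red Hat.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder model ID; swap in the checkpoint your cluster actually serves.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# LoRA trains small low-rank adapter matrices instead of all base weights,
# which is where the cost and resource savings come from.
config = LoraConfig(
    r=16,                                  # rank of the adapter matrices
    lora_alpha=32,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of base weights
```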
Red Hat OpenShift AI also adds capabilities for serving generative AI models, including the vLLM serving runtime for KServe, a Kubernetes-based model inference platform. Also added is support for KServe Modelcars, which adds Open Container Initiative (OCI) repositories as an option for storing and accessing model versions. Additionally, private/public route selection for endpoints in KServe lets organizations harden a model's security posture by directing traffic to internal endpoints when needed. A hedged configuration sketch combining these options follows.
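Red Hat does not show configuration in the announcement. As a sketch under stated assumptions, an InferenceService combining OCI-backed Modelcars storage with a cluster-local (private) endpoint might look like this using KServe's Python SDK. The name, namespace, registry URL, and the "vLLM" model-format label are illustrative assumptions, and OpenShift AI may surface these options through its own dashboard or annotations rather than this label.

```python
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1ModelSpec,
    V1beta1ModelFormat,
    constants,
)

isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_GROUP + "/v1beta1",
    kind="InferenceService",
    metadata=client.V1ObjectMeta(
        name="llama-3-demo",        # hypothetical service name
        namespace="demo-project",   # hypothetical namespace
        # Keeps the endpoint off the public router: cluster-local only.
        labels={"networking.kserve.io/visibility": "cluster-local"},
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            model=V1beta1ModelSpec(
                model_format=V1beta1ModelFormat(name="vLLM"),
                # Modelcars: pull model artifacts from an OCI registry image.
                storage_uri="oci://registry.example.com/models/llama-3:1.0",
            )
        )
    ),
)

KServeClient().create(isvc)
```

Storing models as OCI artifacts lets the same registry, signing, and promotion workflows used for container images apply to model versions, which is the operational appeal of the Modelcars option.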