Red Hat Unveils AI 3 to Power Distributed Inference, Agentic AI at Scale


Red Hat has unveiled Red Hat AI 3, the latest version of its hybrid cloud-native AI platform, designed to simplify and scale production-grade AI inference across enterprise environments. 

According to the official statement, the release brings together innovations from Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI), and Red Hat OpenShift AI, marking a major step toward operationalising next-generation agentic AI at scale.

As enterprises push AI workloads from experimentation to production, they face mounting challenges related to data privacy, infrastructure costs, and model management. 

Red Hat AI 3 provides a unified, open, and scalable platform that supports any model on any hardware, from data centres to sovereign AI environments and edge deployments.

The platform introduces advanced distributed inference capabilities through llm-d, now generally available with Red Hat OpenShift AI 3. 

llm-d offers intelligent model scheduling, disaggregated serving, and cross-platform flexibility across NVIDIA and AMD hardware accelerators, enhancing both performance and cost efficiency for enterprise-scale LLM workloads.

Red Hat AI 3 also introduces a unified environment for collaboration between IT and AI teams, the company said.  

New Model-as-a-Service (MaaS) capabilities allow organisations to centrally serve and manage models for internal use, improving cost control and data privacy. 

The AI Hub provides a curated model catalogue and lifecycle management tools, while the Gen AI Studio offers an interactive workspace for AI engineers to experiment, prototype, and fine-tune generative AI applications with integrated evaluation and monitoring.

The platform includes several optimised open-source models, such as OpenAI’s gpt-oss, DeepSeek-R1, Whisper, and Voxtral Mini, to help developers accelerate development of chat, voice, and retrieval-augmented generation (RAG) applications.

Beyond inference, Red Hat AI 3 sets the stage for autonomous, task-oriented agentic AI systems that represent the evolution of enterprise AI. 

The new release includes a Unified API layer based on the Llama Stack for OpenAI-compatible model interfaces and early adoption of the Model Context Protocol (MCP) to improve interoperability between models and external tools. 
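Because the unified API layer exposes OpenAI-compatible model interfaces, existing client code that targets the standard chat-completions schema should carry over largely unchanged. As a minimal sketch, here is what such a request body looks like; the model name is illustrative and the endpoint path follows the OpenAI convention, neither being a detail confirmed in the announcement:

```python
import json

# Hypothetical example: building an OpenAI-compatible
# /v1/chat/completions request body. Any client that already speaks
# this schema could, in principle, point at an OpenAI-compatible
# endpoint such as the one Red Hat AI 3's unified API layer provides.
def build_chat_request(model: str, user_message: str,
                       temperature: float = 0.2) -> str:
    """Serialise a chat-completions request body as JSON."""
    payload = {
        "model": model,  # illustrative model identifier, not from the announcement
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
    }
    return json.dumps(payload)

body = build_chat_request("gpt-oss", "Summarise today's deployment status.")
print(body)
```

The point of an OpenAI-compatible layer is precisely that this payload shape, not a vendor-specific SDK, is the contract: swapping the serving backend only changes the base URL the request is sent to.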

A new modular toolkit, extending Red Hat’s InstructLab functionality, gives developers greater flexibility for model customisation, data ingestion, and fine-tuning using open-source libraries such as Docling.

Joe Fernandes, vice-president and general manager of Red Hat’s AI business unit, said the company aims to help enterprises overcome the complexity and cost barriers of operationalising AI. 

“By bringing new capabilities like distributed inference with llm-d and a foundation for agentic AI, we are enabling IT teams to confidently operationalise next-generation AI, on their own terms, across any infrastructure,” he said in the statement.

The post Red Hat Unveils AI 3 to Power Distributed Inference, Agentic AI at Scale appeared first on Analytics India Magazine.
