DataCentreNews India - Specialist news for cloud & data centre decision-makers

Virtana launches AI observability for Nutanix environments

Fri, 10th Apr 2026

Virtana has launched AI Factory Observability for Nutanix environments, extending monitoring across Nutanix Cloud Infrastructure and Nutanix Enterprise AI.

The software is designed to give infrastructure and platform teams a single operational view of AI workloads and the systems that support them. It links token-level service data with GPU usage, storage, orchestration, and broader infrastructure behaviour in Nutanix deployments.

The launch comes as companies shift from building AI models to running larger production systems that depend on distributed resources and high levels of concurrency. Virtana cited research showing that 75% of enterprises report double-digit AI job failure rates, and that more than half attribute those failures to infrastructure bottlenecks that reduce throughput, raise cost per token, and limit the number of concurrent agent workloads they can support.

Operational Focus

Virtana is positioning the product around the operational demands of agentic AI, where software agents continuously adjust how they use compute and other infrastructure. In Nutanix environments, teams need visibility not only into the infrastructure layer but also into the AI services and workflows running above it.

The observability platform spans Nutanix AHV, Nutanix Enterprise AI, Kubernetes orchestration, Nvidia GPU clusters, and distributed AI workflows. It is intended to help teams see how workload behaviour affects infrastructure consumption and where contention, inefficiency, or reliability issues emerge.

Virtana said key functions include real-time GPU telemetry covering utilisation, memory, power draw, temperature, and health across distributed clusters. The system can also detect idle and underused GPUs, correlate workloads with GPU consumption across training and inference tasks, and provide token-level visibility into throughput, latency, and resource demand.
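As a rough illustration of what idle-GPU detection over such telemetry can involve, the minimal Python sketch below averages utilisation and memory samples per GPU and flags any device that stays under both thresholds. It is not Virtana's implementation: the `GpuSample` fields, the `find_idle_gpus` function, and the 10% thresholds are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GpuSample:
    """One telemetry reading for one GPU (hypothetical schema)."""
    gpu_id: str
    utilisation_pct: float   # compute utilisation, 0-100
    memory_used_pct: float   # memory in use, 0-100
    power_watts: float
    temperature_c: float


def find_idle_gpus(samples, util_threshold=10.0, mem_threshold=10.0):
    """Return sorted IDs of GPUs whose average utilisation and
    average memory use both fall below the given thresholds.

    The thresholds are illustrative assumptions, not vendor defaults.
    """
    # Group readings by GPU so each device is judged on its own history.
    by_gpu = {}
    for sample in samples:
        by_gpu.setdefault(sample.gpu_id, []).append(sample)

    idle = []
    for gpu_id, rows in by_gpu.items():
        avg_util = mean(r.utilisation_pct for r in rows)
        avg_mem = mean(r.memory_used_pct for r in rows)
        if avg_util < util_threshold and avg_mem < mem_threshold:
            idle.append(gpu_id)
    return sorted(idle)
```

In practice the readings would come from a telemetry collector rather than hand-built objects, and a production system would likely also weigh power draw, temperature, and the length of the observation window before labelling a GPU as idle.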

Another focus is risk detection. According to Virtana, the platform can identify thermal, power, and reliability issues before they affect production AI services, while also analysing performance in multi-node and multi-GPU environments that support agentic workloads.

Luke Congdon, Vice President of Product Management at Nutanix, said stronger visibility is becoming more important as AI systems grow more complex in production.

"As enterprises adopt the Nutanix Agentic AI platform to build and run intelligent, distributed AI systems, understanding how those workloads behave across infrastructure and services becomes critical," he said. "Virtana's extension of observability into Nutanix Enterprise AI helps provide that visibility, enabling organizations to operate AI factories with greater performance, efficiency, and control."

Production Pressure

Paul Appleby, Chief Executive Officer of Virtana, said the main challenge for many companies is no longer initial deployment.

"Enterprises have proven they can stand up AI infrastructure," he said. "The challenge now is operating agentic AI environments where systems reason, adapt, and act across distributed resources. These are dynamic systems that demand full-stack visibility and control to optimize GPU utilization, manage cost efficiency, and support thousands of concurrent agents with the performance and governance required for production at scale."

Virtana argues that Nutanix Enterprise AI now occupies an important layer in these environments because it is where models, agent services, and enterprise AI workflows are deployed and scaled. That makes it a key point for observing inference performance, GPU consumption, infrastructure contention, and system reliability together rather than through separate tools.

Amitkumar Rathi, Chief Product Officer at Virtana, said that shift requires deeper operational visibility.

"AI workloads are no longer static. They are increasingly agentic, continuously adapting how they consume infrastructure," he said. "By extending AI Factory Observability into Nutanix Enterprise AI, we give organizations end-to-end visibility and control across the layer where AI services are built and operated, while connecting that activity back to the infrastructure supporting it. Platform teams can manage performance, reliability, and cost with greater precision, and data teams gain the operational context required to run AI in production with confidence."

The launch is another sign that suppliers in the enterprise AI market are focusing more on operational challenges, not just model development, as companies seek to reduce failed jobs, improve GPU utilisation, and keep production costs under closer control.