NVIDIA Unveils Vera-CPU Rack at GTC 2026: New Benchmark for CPU-Only AI Infrastructure

At GTC 2026, NVIDIA unveiled the Vera-CPU Rack, a system designed specifically for CPU-only inference in AI applications. The rack integrates 256 Vera CPUs, each built on custom Armv9.2-A cores named Olympus, marking a strategic shift from licensed core designs to in-house ones. Each Vera CPU has 88 cores with Simultaneous Multithreading (SMT), for a total of 22,528 cores and 45,056 threads across the rack. The chips support FP8 precision, which matters for efficient AI workloads, and offer up to 1.5 TB of RAM per CPU, along with a scalable coherency fabric for inter-processor communication.
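
The rack-level core and thread counts follow directly from the per-CPU figures. The short Python sketch below reproduces that arithmetic; the function name is illustrative, and the value of two hardware threads per core is an assumption inferred from the quoted totals, not a figure stated separately in the announcement.

```python
# Per-CPU and per-rack figures as quoted in the article.
CPUS_PER_RACK = 256
CORES_PER_CPU = 88
# Assumption: two hardware threads per core, inferred from 45,056 / 22,528.
THREADS_PER_CORE = 2


def rack_totals(cpus: int, cores_per_cpu: int, threads_per_core: int) -> tuple[int, int]:
    """Return (total cores, total hardware threads) for one rack."""
    cores = cpus * cores_per_cpu
    return cores, cores * threads_per_core


total_cores, total_threads = rack_totals(CPUS_PER_RACK, CORES_PER_CPU, THREADS_PER_CORE)
print(f"{total_cores:,} cores, {total_threads:,} threads")
```

With the quoted inputs this yields 22,528 cores and 45,056 threads, matching the totals given for the rack.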

The Vera-CPU Rack is equipped with 400 TB of LPDDR memory, a significant increase over traditional x86 systems. Total memory bandwidth across the rack reaches 300 TB/s, with each Vera chip delivering 1.2 TB/s, three times that of comparable x86 processors. BlueField-4 DPUs and an 800 GbE connection provide high-performance, secure networking for large-scale AI infrastructure. The chiplet-based design is intended to maximize scalability and flexibility in datacenter environments.
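
As a rough cross-check, the rack-level memory and bandwidth figures can be compared with the per-CPU numbers multiplied by 256. The sketch below assumes every CPU is populated at the quoted 1.5 TB maximum and sustains the quoted 1.2 TB/s; both are simplifications, and the helper name is illustrative.

```python
# Figures as quoted in the article.
CPUS_PER_RACK = 256
MAX_RAM_TB_PER_CPU = 1.5        # "up to 1.5 TB of RAM per CPU"
BANDWIDTH_TB_S_PER_CPU = 1.2    # per-chip memory bandwidth


def rack_aggregate(cpus: int, per_cpu_value: float) -> float:
    """Scale a per-CPU quantity linearly to the whole rack (idealized)."""
    return cpus * per_cpu_value


rack_memory_tb = rack_aggregate(CPUS_PER_RACK, MAX_RAM_TB_PER_CPU)
rack_bandwidth_tb_s = rack_aggregate(CPUS_PER_RACK, BANDWIDTH_TB_S_PER_CPU)

print(f"memory: {rack_memory_tb:.1f} TB, bandwidth: {rack_bandwidth_tb_s:.1f} TB/s")
```

Under these assumptions the products come to 384 TB and 307.2 TB/s, which is consistent with the quoted 400 TB and 300 TB/s if those are read as rounded values.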

The new CPU is aimed at applications such as agentic workloads, reinforcement learning, and AI training, where high single-thread performance and low latency are critical. NVIDIA emphasizes that the Vera CPU can be deployed either alongside Rubin GPUs or as a standalone solution, making the platform adaptable to diverse use cases. The Vera-CPU Rack is another step toward positioning NVIDIA as a comprehensive provider in the broader server CPU market, covering CPU-dominated AI architectures as well as GPU-based ones.

Development is proceeding in close collaboration with partners such as HPE, which is building on the system with the Cray Supercomputing GXC240. That model is designed to support up to 640 Vera CPUs per rack, 2.5 times the 256 in NVIDIA's own rack. The Vera CPU itself is scheduled to launch in the second half of 2026. With this announcement, NVIDIA signals its intent to blur the line between GPU- and CPU-based AI systems and to deliver a unified, high-performance infrastructure for the next generation of AI applications.