News stories tagged with #Inference
Nvidia Integrates Groq 3 LPU into Vera-Rubin Platform: A New Era of Low-Latency AI Inference Begins
At GTC 2026, Nvidia announced the integration of Groq's 3rd-generation Language Processing Unit (LPU) into its new Vera-Rubin-NVL72 platform, aiming to dramatically boost AI inference throughput at ultra-low latency. Designed specifically for inference workloads, the LPU leverages large on-chip SRAM and high internal bandwidth for rapid token processing. The technology complements Nvidia's existing GPU ecosystem and is deployed in new LPX racks. Partners such as HPE and Giga Computing showcased next-generation AI factories and high-performance computing infrastructure built around these advancements at the event.
AMD and Nutanix Partner for Open AI Infrastructure While Speculation Grows Around RDNA-5 GPU 'AT0'
AMD and Nutanix have formed a strategic partnership to develop open, scalable AI infrastructure for enterprises, combining AMD EPYC CPUs and AMD Instinct GPUs with Nutanix's cloud and Kubernetes platforms. The collaboration includes a $150 million investment by AMD in Nutanix and aims to deliver production-ready AI solutions for agentic applications across data center, hybrid, and edge environments. Meanwhile, speculation continues about a potential AMD RDNA-5 chip codenamed 'AT0,' which could be released as a limited-run gaming GPU in the vein of the Radeon VII. AMD has not confirmed any details, and a market launch is unlikely before 2027.