AMD Introduces Open Rack-Scale AI Infrastructure and MI350 GPUs

At its Advancing AI event, AMD unveiled an open, end-to-end rack-scale AI platform vision anchored by the new Instinct MI350 Series accelerators. The MI350X and MI355X GPUs deliver a 4× increase in AI compute and a 35× leap in inference performance, achieve a 38× training energy-efficiency gain that surpasses AMD's original 30× five-year goal, and integrate with fifth-gen Epyc CPUs and Pensando NICs. AMD also previewed future Helios racks and highlighted ROCm 7 and partner deployments across hyperscalers.

Published June 14, 2025 at 04:13 AM EDT in Artificial Intelligence (AI)

AMD Unveils End-to-End AI Platform Vision

At its Advancing AI event, AMD rolled out its vision for open, scalable, rack-scale AI infrastructure built on industry standards. CEO Lisa Su unveiled the new Instinct MI350 Series accelerators, boasting a fourfold AI compute boost and a 35-fold leap in inference performance over the previous generation.

Instinct MI350 Series Delivers 4× Compute, 35× Inference

The MI350 lineup—comprising MI350X and MI355X GPUs—sets a new benchmark for generative AI and high-performance computing (HPC). AMD reports a 38× energy-efficiency gain for AI training nodes, surpassing its original five-year goal of a 30× improvement; a back-of-the-envelope illustration follows the list below.

  • 4× generation-on-generation AI compute increase
  • 35× generational leap in inference performance
  • 38× improvement in AI training energy efficiency
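
To make the headline figure concrete, here is a rough, hypothetical calculation (our illustration, not AMD's published methodology): for a fixed training workload, energy use scales inversely with node efficiency, so a 38× gain cuts energy to roughly 2.6% of the baseline.

    # Hypothetical illustration: energy for a fixed training workload
    # scales inversely with node-level energy efficiency.
    baseline_energy_mwh = 100.0   # assumed energy on an older-generation node
    reported_gain = 38.0          # AMD's reported training-node gain
    original_goal = 30.0          # AMD's original five-year 30x goal

    energy_at_gain = baseline_energy_mwh / reported_gain   # ~2.6 MWh
    energy_at_goal = baseline_energy_mwh / original_goal   # ~3.3 MWh
    print(f"38x gain: {energy_at_gain:.1f} MWh vs 30x goal: {energy_at_goal:.1f} MWh")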

Open Rack-Scale Architecture Beyond 2027

AMD’s open rack-scale designs integrate fifth-gen Epyc CPUs, Pensando Pollara NICs and Instinct MI350 Series GPUs. These systems are already deployed at hyperscalers such as Oracle Cloud Infrastructure, with broad availability slated for the second half of 2025.

  • 5th Gen AMD Epyc CPUs
  • AMD Instinct MI350 Series GPUs
  • AMD Pensando Pollara NICs

Looking ahead, AMD previewed Helios: a full rack powered by Zen 6-based Epyc Venice CPUs, next-gen Instinct MI400 Series GPUs and Pensando Vulcano NICs, targeting inference-centric cloud and on-prem deployments.

Expanding Open Ecosystem with ROCm 7

ROCm 7 enhances the open-source software stack with broader hardware support, optimized drivers and new APIs. The update aims to simplify developer workflows and accelerate AI model development across industry-standard frameworks such as PyTorch; a minimal device check is sketched after the list below.

  • Improved support for industry frameworks
  • Expanded hardware and OS compatibility
  • New development tools, drivers and APIs
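
A key draw of ROCm for developers is that mainstream frameworks run on it with little or no code change. As a minimal sketch (assuming a ROCm build of PyTorch, which exposes AMD GPUs through the standard torch.cuda interface), the following checks which backend is active and dispatches a small matrix multiply to an Instinct GPU:

    # Minimal sketch: verify a ROCm-enabled PyTorch build sees an AMD GPU.
    # ROCm builds of PyTorch reuse the torch.cuda namespace, so code
    # written for NVIDIA GPUs runs unchanged on Instinct hardware.
    import torch

    if torch.cuda.is_available():
        # torch.version.hip is a version string on ROCm builds and None
        # on CUDA builds -- the standard way to tell the two apart.
        backend = f"ROCm/HIP {torch.version.hip}" if torch.version.hip else "CUDA"
        print(f"Backend: {backend}")
        print(f"Device:  {torch.cuda.get_device_name(0)}")

        # A small matmul dispatched to the accelerator, written exactly
        # as it would be for a CUDA device.
        x = torch.randn(1024, 1024, device="cuda")
        y = x @ x
        print(f"Result on {y.device}, norm = {y.norm().item():.2f}")
    else:
        print("No GPU visible; check the driver and the PyTorch build.")

The same portability applies at lower levels: HIP, ROCm's CUDA-like C++ API, lets kernel code target GPUs from either vendor from a single source tree.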

Broad Adoption Across Industry Leaders

AMD’s ecosystem partners showcased live deployments: Meta runs Llama models on MI300X, Oracle Cloud Infrastructure (OCI) is building zettascale clusters with MI355X, and Microsoft powers proprietary and open LLMs on Azure. Deployments by Cohere, Humain, Red Hat and others highlight flexible AI infrastructure with a focus on total cost of ownership (TCO).

  • Meta uses Instinct GPUs for Llama 3 and Llama 4 inference
  • Oracle Cloud Infrastructure zettascale AI clusters
  • Azure runs both proprietary and open LLMs
  • Cohere powers enterprise-grade inference
  • Red Hat OpenShift AI on Instinct GPUs

As AI scales from training to inference, openness and collaboration will define the next chapter. QuarkyByte helps organizations evaluate these rack-scale solutions, analyze performance trade-offs and architect efficient, future-proof AI infrastructure tailored to real-world workloads.

Discover how QuarkyByte’s AI infrastructure analysis can help you compare rack-scale solutions like AMD’s MI350 platform and map performance gains to your workloads. Engage with our experts to model energy efficiency improvements and future-proof your AI deployments with data-driven insights. Schedule a consultation.