- HPE will ship 72-GPU racks with next-generation AMD Instinct accelerators globally
- Venice CPUs combined with GPUs aim for exascale-level AI performance per rack
- Helios relies on liquid cooling and double-width chassis for thermal management
HPE has announced plans to integrate AMD’s Helios rack-scale AI architecture into its product lineup starting in 2026.
The collaboration gives Helios its first major OEM partner and positions HPE to ship full 72-GPU AI racks built around AMD’s next-generation Instinct MI455X accelerators.
These racks will be paired with EPYC Venice CPUs and will use a scalable Ethernet-based fabric developed with Broadcom.
The move creates a clear commercial path for Helios and puts the architecture in direct competition with Nvidia’s rack-scale platforms already in service.
Rack layout and performance goals
The Helios reference design is based on Meta’s Open Rack Wide standard.
It uses a double-width liquid-cooled chassis to house the MI450 series GPUs, Venice CPUs, and Pensando networking hardware.
AMD is targeting up to 2.9 exaFLOPS of FP4 compute per rack with the MI455X generation, along with 31TB of HBM4 memory.
The system presents the rack’s GPUs as a single scale-up domain, allowing workloads to span all 72 accelerators without local bottlenecks.
A purpose-built HPE Juniper switch supporting Ultra Accelerator Link over Ethernet forms the high-bandwidth GPU interconnect.
It offers an alternative to Nvidia’s NVLink-centric approach.
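As a rough sanity check on those headline figures, dividing the rack-level targets by the 72-GPU count gives the per-accelerator share. This is illustrative arithmetic only, not official per-GPU specifications from AMD:

```python
# Back-of-envelope per-GPU figures derived from AMD's stated
# rack-level targets (illustrative only; not official specs).

GPUS_PER_RACK = 72
RACK_FP4_EXAFLOPS = 2.9   # target FP4 compute per rack
RACK_HBM4_TB = 31         # total HBM4 memory per rack

# 1 exaFLOPS = 1000 petaFLOPS; 1 TB = 1000 GB
fp4_per_gpu_pflops = RACK_FP4_EXAFLOPS * 1000 / GPUS_PER_RACK
hbm4_per_gpu_gb = RACK_HBM4_TB * 1000 / GPUS_PER_RACK

print(f"~{fp4_per_gpu_pflops:.1f} PFLOPS of FP4 per GPU")  # ~40.3
print(f"~{hbm4_per_gpu_gb:.0f} GB of HBM4 per GPU")        # ~431
```

Those per-GPU numbers land in the range AMD has publicly discussed for the MI400 generation, which suggests the rack targets are straightforward multiples of the accelerator specs rather than assuming extra headroom.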
The Stuttgart High Performance Computing Center has selected HPE’s Cray GX5000 platform for its next flagship system, called Herder.
Herder will use MI430X GPUs and Venice CPUs in direct liquid-cooled blades and will replace the current Hunter system in 2027.
HPE says waste heat from the GX5000 racks will be reused to warm campus buildings, a nod to efficiency goals alongside raw performance.
AMD and HPE plan to make Helios-based systems available globally next year, expanding access to rack-scale AI hardware for research institutions and enterprises.
Helios uses an Ethernet fabric to connect GPUs and CPUs, which contrasts with Nvidia’s NVLink approach.
Using Ultra Accelerator Link over Ethernet and Ultra Ethernet Consortium-aligned hardware supports scalable designs within an open standards framework.
This approach allows GPU counts on par with other high-end AI racks, but performance under sustained multi-node workloads remains unproven, and relying on a single Ethernet layer could introduce latency or bandwidth bottlenecks in real applications. The headline specifications alone will not determine real-world results, which will hinge on effective cooling, network traffic management, and software optimization.
Via Tom's Hardware