- AMD's MI350 series took the spotlight at Hot Chips 2025, with scalability from node to rack
- The MI355X DLC rack features 128 GPUs, 36TB of HBM3E and 2.6 exaflops of FP4
- Nvidia's Vera Rubin system, due next year, is a beast built for maximum scale
AMD used the recent Hot Chips 2025 event to talk in more depth about the CDNA 4 architecture that powers its new Instinct MI350 series, and to show how its accelerators scale from node to rack.
The MI350 series platforms combine 5th Gen EPYC CPUs, MI350 GPUs and AMD Pollara NICs in standard OCP designs with UEC-compliant networking. Infinity Fabric delivers up to 1075GB/s of bandwidth.
At the top end sits the MI355X DLC 'ORV3' rack, a 128-GPU system with 36TB of HBM3E memory and peak performance of 2.6 exaflops at FP4 precision (there is also a 96-GPU EIA version with 27TB of HBM3E).
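As a quick sanity check on those rack figures, dividing the quoted totals by the GPU count gives the implied per-GPU numbers. This is a rough sketch using only the article's figures; the derived values are approximate and may round differently from AMD's official per-GPU specs.

```python
# Implied per-GPU figures for the 128-GPU MI355X DLC 'ORV3' rack.
# Rack totals are from the article; per-GPU values are derived here.
RACK_GPUS = 128
RACK_HBM3E_TB = 36
RACK_FP4_EXAFLOPS = 2.6

hbm_per_gpu_gb = RACK_HBM3E_TB * 1000 / RACK_GPUS          # TB -> GB
fp4_per_gpu_pflops = RACK_FP4_EXAFLOPS * 1000 / RACK_GPUS  # EF -> PF

print(f"HBM3E per GPU: ~{hbm_per_gpu_gb:.0f} GB")
print(f"FP4 per GPU:   ~{fp4_per_gpu_pflops:.1f} PFLOPS")
```

That works out to roughly 281GB and 20.3 petaflops per GPU; the memory figure reflects rounding in the 36TB rack total.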
At the node level, AMD presented flexible designs for both air-cooled and liquid-cooled systems.
An MI350X platform with eight GPUs achieves 73.8 petaflops of FP8, while the liquid-cooled MI355X platform reaches 80.5 petaflops of FP8 in a denser form factor.
AMD also confirmed its roadmap. The chip giant debuted the MI325X in 2024, the MI350 family arrived ahead of schedule in 2025, and the Instinct MI400 is set to arrive in 2026.
The MI400 will offer up to 40 petaflops of FP4, 20 petaflops of FP8, 432GB of HBM4 memory, 19.6TB/s of memory bandwidth and 300GB/s of scale-out bandwidth per GPU.
AMD says the performance curve from MI300 to MI400 shows accelerating gains rather than incremental steps.
Here comes Vera Rubin
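The platform and roadmap numbers can be cross-checked the same way. A small sketch using only the article's figures; the MI355X per-GPU values are implied by dividing the platform and rack totals, so the generational ratios are approximate.

```python
# Implied per-GPU FP8 from the 8-GPU platform figures in the article.
for name, platform_fp8_pf in [("MI350X", 73.8), ("MI355X", 80.5)]:
    print(f"{name}: {platform_fp8_pf / 8:.2f} PFLOPS FP8 per GPU")

# Rough MI355X -> MI400 generational ratios (per GPU).
mi355x_fp4_pf = 2.6 * 1000 / 128   # implied from the 128-GPU rack (~20.3)
mi355x_hbm_gb = 36 * 1000 / 128    # implied from the rack total (~281)
mi400_fp4_pf, mi400_hbm_gb = 40, 432  # quoted MI400 figures

print(f"FP4 gain:    ~{mi400_fp4_pf / mi355x_fp4_pf:.1f}x")
print(f"Memory gain: ~{mi400_hbm_gb / mi355x_hbm_gb:.1f}x")
```

On these derived numbers, the MI400 works out to roughly double the FP4 throughput and around 1.5x the memory per GPU.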
The elephant in the room is, naturally, Nvidia, which is planning its Rubin architecture for 2026-27. The Vera Rubin NVL144 system, slated for the second half of next year (according to slides shared by Nvidia), will be rated at 3.6 exaflops of FP4 inference and 1.2 exaflops of FP8. It has 13TB/s of HBM4 bandwidth and 75TB of fast memory, delivering a 1.6x gain over its predecessor.
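The same per-GPU division can be applied to Nvidia's quoted NVL144 totals. These are derived values from the article's figures only; note that Nvidia's naming counts GPU dies here, so per-package numbers may differ.

```python
# Implied per-GPU figures for the Vera Rubin NVL144 (article numbers).
GPUS = 144
print(f"FP4 inference per GPU: {3.6 * 1000 / GPUS:.0f} PFLOPS")
print(f"Fast memory per GPU:   ~{75 * 1000 / GPUS:.0f} GB")
print(f"FP4:FP8 ratio:         {3.6 / 1.2:.0f}:1")
```

That comes to 25 petaflops of FP4 and around half a terabyte of fast memory per GPU, with the quoted FP4 inference figure three times the FP8 one.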
Nvidia is integrating 88 custom CPU cores with 176 threads, connected by 1.8TB/s NVLink-C2C, along with NVLink 6 and 28.8TB/s CX9 interconnect.
For late 2027, Nvidia has the Rubin Ultra NVL576 system planned. This will be rated at 15 exaflops of FP4 inference and 5 exaflops of FP8, with 4.6PB/s of HBM4E bandwidth, 365TB of fast memory, and interconnect speeds of 1.5PB/s with NVLink 7 and 115TB/s using CX9.
At full scale, the Rubin Ultra system will pack 576 GPUs, 2,304 HBM4E stacks for a total of 150TB of memory, and 1,300 trillion transistors, backed by 12,672 Vera CPU cores, 576 ConnectX-9 NICs, 72 BlueField DPUs and 144 NVLink switches rated at 1.5PB/s.
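Those full-scale totals imply some tidy per-GPU ratios, sketched here from the article's numbers alone (derived and approximate):

```python
# Per-GPU breakdown of the full-scale Rubin Ultra NVL576 (article numbers).
GPUS = 576
print(f"HBM4E stacks per GPU:  {2304 // GPUS}")
print(f"Memory per GPU:        ~{150 * 1000 / GPUS:.0f} GB")
print(f"Vera cores per GPU:    {12672 // GPUS}")
print(f"FP4 inference per GPU: ~{15 * 1000 / GPUS:.0f} PFLOPS")
```

Four HBM4E stacks, roughly 260GB of memory and 22 CPU cores per GPU, at around 26 petaflops of FP4 each.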
It is a highly integrated monolithic beast aimed squarely at maximum scale.
While it is fun to compare AMD's and Nvidia's numbers, it is obviously not an entirely fair comparison. AMD's MI355X DLC rack is a product detailed in 2025, while Nvidia's Rubin systems remain roadmap designs for 2026 and 2027. Even so, it is an interesting glimpse of how each company is framing the next wave of AI infrastructure.