Huawei just launched a monster AI chip that delivers 2.87x Nvidia H20 performance and massive memory gains under heavy constraints.

Huawei unveils Atlas 350 with major FP4 computing performance claims
New accelerator card focuses on multi-modal AI processing and inference workloads
Huawei Atlas 350 offers higher memory capacity and improved bandwidth efficiency

Huawei officially launched the Atlas 350 accelerator card, featuring its new Ascend 950PR processor, at the Huawei China 2026 Partner Conference in Shenzhen.

The company claims that this NPU delivers 1.56 PFLOPS of FP4 computing performance, reportedly 2.87 times that of Nvidia’s H20.
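As a rough sanity check, the ratio in that claim implies a baseline figure for the H20. The sketch below only divides the two numbers quoted above. The function and variable names are illustrative, and the resulting baseline is an inference from Huawei's claim, not a published Nvidia specification.

```python
# Figures quoted in Huawei's claim (not independently verified).
ATLAS_350_FP4_PFLOPS = 1.56
CLAIMED_RATIO_VS_H20 = 2.87


def implied_baseline_pflops(claimed_pflops: float, ratio: float) -> float:
    """Return the baseline throughput implied by an 'N times' claim."""
    return claimed_pflops / ratio


baseline = implied_baseline_pflops(ATLAS_350_FP4_PFLOPS, CLAIMED_RATIO_VS_H20)
print(f"Implied H20 baseline: {baseline:.3f} PFLOPS")
```

The result comes out at a little over half a PFLOPS. The claim does not say which H20 precision mode that baseline refers to, so the figure should be read as a reference point only.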

The comparison is hard to verify because Hopper-era GPUs such as the H20 do not support FP4 natively. Even so, the Atlas 350 is the first Chinese accelerator optimized for this low-precision format, which lets larger AI models run on the same hardware with reduced memory requirements.
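To illustrate why a lower-precision format reduces memory requirements, here is a minimal sketch that estimates the weight footprint of a model at different precisions. The 70-billion-parameter model is a hypothetical example, not one named by Huawei. The estimate covers weights only and ignores the KV cache, activations, and any per-block scaling data that real FP4 schemes add. The 112 GB capacity is the figure the article reports for the Atlas 350.

```python
# Bits used to store one weight at each precision.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}

CARD_MEMORY_GB = 112        # capacity reported for the Atlas 350
HYPOTHETICAL_PARAMS = 70e9  # assumption: an illustrative 70B-parameter model


def weight_footprint_gb(num_params: float, bits: int) -> float:
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    size_gb = weight_footprint_gb(HYPOTHETICAL_PARAMS, bits)
    verdict = "fits" if size_gb <= CARD_MEMORY_GB else "does not fit"
    print(f"{name}: {size_gb:.0f} GB of weights, {verdict} in {CARD_MEMORY_GB} GB")
```

Under these assumptions the same model needs a quarter of the memory at FP4 that it needs at FP16, which is the effect the FP4 support is meant to deliver.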


Technical updates and memory performance

The Ascend 950PR chip improves on the previous Ascend 910 series with an upgraded microarchitecture, faster memory access, and flexible programming modes.

Huawei equips the Atlas 350 with 112 GB of proprietary HBM, known as HiBL 1.0, which reportedly offers up to 1.4 TB/s of bandwidth and a memory access granularity of 128 bytes.

This configuration enables efficient multimodal generation and inference tasks and reportedly quadruples memory access efficiency for small operators compared to the previous generation.

Its interconnect bandwidth also reaches 2 TB/s using the LingQu protocol, 2.5 times higher than that of the Ascend 910 series.
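The memory and interconnect figures above can be put in context with some back-of-envelope arithmetic. The sketch below uses only the numbers reported in this article. The function names are illustrative, and both results are idealized estimates at peak bandwidth rather than measured values.

```python
# Figures reported for the Atlas 350 in the article.
HBM_CAPACITY_GB = 112
HBM_BANDWIDTH_TBPS = 1.4
INTERCONNECT_TBPS = 2.0
INTERCONNECT_GAIN = 2.5  # claimed multiple over the Ascend 910 series


def full_memory_sweep_ms(capacity_gb: float, bandwidth_tbps: float) -> float:
    """Idealized time to read the whole memory once at peak bandwidth."""
    bandwidth_gb_per_s = bandwidth_tbps * 1000
    return capacity_gb / bandwidth_gb_per_s * 1000


def implied_previous_tbps(current_tbps: float, gain: float) -> float:
    """Previous-generation bandwidth implied by an 'N times higher' claim."""
    return current_tbps / gain


sweep_ms = full_memory_sweep_ms(HBM_CAPACITY_GB, HBM_BANDWIDTH_TBPS)
previous_tbps = implied_previous_tbps(INTERCONNECT_TBPS, INTERCONNECT_GAIN)

print(f"Full memory sweep at peak bandwidth: about {sweep_ms:.0f} ms")
print(f"Implied Ascend 910 series interconnect: about {previous_tbps:.1f} TB/s")
```

The sweep time matters for inference because memory-bound workloads tend to be limited by how quickly weights can be streamed from memory, so it acts as a loose lower bound rather than a prediction of real latency.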

Huawei markets the Atlas 350 for recommendation inference, LLM processing, and multimodal AI workloads.

Seven key partners, including Kunlun, Huakun Zhenyu, Shenzhou Kuntai and Yangtze Computing, have developed complete system products leveraging the Atlas 350.

These brands have created high-performance custom inference solutions for enterprise customers.

The accelerator is designed to integrate with AI ecosystems, allowing partners to optimize performance for specific workloads while maintaining compatibility with Huawei’s AI software stack.

The Atlas 350 reflects China’s efforts to establish self-sufficiency in AI computing hardware under US export restrictions.

While Huawei cannot access TSMC’s CoWoS technology, the company has implemented alternative advanced packaging solutions for HBM and memory stacking.

Huawei has not announced precise availability dates, a common practice with AI accelerators, but it launched the Ascend 950PR in the first quarter of 2026, as promised.

The Atlas 350 is reportedly priced at around 111,000 yuan, or approximately $16,000, comparable to the Nvidia H20, which can range from $15,000 to $25,000.

Via Tom's Hardware
