- Meta’s 1700W superchip offers 30 PFLOPs and 512GB of HBM
- MTIA 450 and 500 prioritize inference over pre-training workloads
- Future MTIA generations will expand support for GenAI inference workloads
Meta is enhancing its AI infrastructure with a portfolio of custom MTIA chips designed specifically for inference workloads in its applications.
The company is developing a 1700W superchip capable of 30 PFLOPs with 512GB of HBM, built on the same MTIA infrastructure to handle inference tasks at scale.
Interestingly, it is achieving this feat without any of the usual partners: not Nvidia, AMD, Intel, or Arm.
According to Meta, hundreds of thousands of MTIA chips are already deployed in production, supporting classification, recommendation, and ad serving workloads.
These chips are part of a full-stack system optimized for Meta’s specific requirements, achieving greater computing efficiency than general-purpose hardware for the intended workloads.
Unlike other hyperscalers such as Google, AWS, Microsoft, and Apple, Meta follows a fully custom silicon strategy.
This design prioritizes efficiency over general-purpose usage, allowing inference to run more cost-effectively than on conventional GPUs or CPUs.
The chips maintain compatibility with industry-standard software such as PyTorch, vLLM, and Triton.
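Meta hasn't detailed the MTIA developer workflow here, but a minimal sketch can show what "PyTorch compatibility" typically means for a custom accelerator: the same model code runs, with the hardware selected as a device and the graph handed to a compile backend. The device check and the toy ranking model below are illustrative assumptions, and the snippet falls back to CPU so it runs anywhere.

```python
import torch
import torch.nn as nn

# Pick the accelerator if present, otherwise fall back to CPU. Recent PyTorch
# builds expose an "mtia" device type for Meta's accelerators; availability
# depends on the build and hardware, so we check defensively.
device = "mtia" if getattr(torch, "mtia", None) and torch.mtia.is_available() else "cpu"

# A toy ranking head standing in for a classification/recommendation model
# (purely illustrative, not Meta's actual workload).
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8)).to(device).eval()

# torch.compile lets a hardware backend optimize the graph; the default
# backend is used here so the sketch stays runnable on any machine.
compiled = torch.compile(model)

with torch.no_grad():
    scores = compiled(torch.randn(1, 64, device=device))
print(scores.shape)  # torch.Size([1, 8])
```

The point of this kind of compatibility is that model code stays unchanged; only the device selection and compile backend differ between GPUs and custom silicon.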
Meta’s MTIA roadmap anticipates four new generations of chips over the next two years, including the MTIA 300, currently in production for classification and recommendations.
Future generations (MTIA 400, 450, and 500) will expand support for GenAI inference workloads, with designs able to fit into existing rack infrastructure.
Meta emphasizes rapid, iterative development, releasing new chips approximately every six months through modular and reusable designs.
The modular design allows new chips to slot into existing rack systems, reducing deployment friction and shortening time to production.
The approach allows the company to adopt emerging AI techniques and hardware improvements faster than its competitors, which typically cycle one to two years per generation.
Unlike most mainstream AI chips that prioritize large-scale GenAI pre-training and then adapt for inference, Meta’s MTIA 450 and 500 focus on inference workloads first.
The chips can also support other tasks, including classification and recommendation training or GenAI training, but their design is geared toward the anticipated growth in inference demand.
Meta’s system-level design aligns with Open Compute Project standards, enabling frictionless deployment in data centers while maintaining high computing efficiency.
The company recognizes that no single chip can handle the full spectrum of its AI workloads.
That’s why it’s deploying multiple generations of MTIA along with complementary silicon from other vendors.
The strategy aims to balance flexibility and performance while accelerating innovation towards personal superintelligence.