- Skymizer Claims Giant AI Models No Longer Need Hyperscale GPU Infrastructure
- Old 28nm Chips Suddenly Powering Massive Language Models at Surprisingly Low Power
- The HTX301 packs 384GB of memory into a single PCIe accelerator card
A Taiwanese company called Skymizer has unveiled a PCIe AI accelerator that challenges both AMD and Nvidia using surprisingly old technology.
The HTX301 card can run language models with up to 700 billion parameters in a single device while consuming only 240 watts of power.
The card achieves this feat using older 28-nanometer chips and standard LPDDR4 and LPDDR5 memory instead of expensive HBM or GDDR solutions.
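Fitting a 700-billion-parameter model into 384GB only pencils out with aggressive weight compression. A rough sizing sketch (the quantization levels here are illustrative assumptions, not Skymizer's disclosed scheme):

```python
# Back-of-the-envelope weight-memory sizing for a 700B-parameter model.
# The bit widths below are illustrative, not Skymizer's actual format.
def model_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits}-bit: {model_memory_gb(700, bits):.0f} GB")
# 16-bit: 1400 GB, 8-bit: 700 GB, 4-bit: 350 GB
```

Only at roughly 4 bits per weight does a 700B model squeeze under 384GB, and that is before accounting for the KV cache, which is why the card's compression of both weights and cache matters.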
Old-tech chip competes with modern AI accelerators
Skymizer claims that its card delivers 30 tokens per second using only 0.5 TOPS of compute and 100GB/s of memory bandwidth.
The HTX301 is built on Skymizer’s HyperThought platform, which features next-generation LPU (language processing unit) IP designed specifically for large language model workloads.
Each PCIe card combines six HTX301 chips working together and offers up to 384GB of total memory capacity.
The design uses efficient compression techniques for both the weights and the KV cache, outperforming the open source llama.cpp by 9 to 17.8 percent.
Its power consumption is less than half of what leading PCIe AI accelerators from AMD and NVIDIA typically require.
The card supports agentic AI for coding, automation, and domain-specific workflows without the need for large-scale GPU clusters.
Running large language models in the cloud raises privacy concerns and unpredictable costs that many organizations find unacceptable.
Upgrading on-premises infrastructure to support massive GPU accelerator platforms often requires costly redesigns of data center power and cooling systems.
Skymizer’s HTX301 offers businesses a third option that fits standard air-cooled servers without any changes to infrastructure.
The company claims its new technology ends the era in which hyperscale GPU clusters are required for ultra-large LLMs.
The PCIe card form factor enables enterprises to scale AI inference on-premises while maintaining data sovereignty and predictable infrastructure costs.
Skymizer HTX301 awaits real-world testing
Skymizer will preview the HTX301 at Computex this year, allowing independent verification of its performance numbers.
The specs of this chip look impressive on paper, but real-world testing will determine whether the card actually delivers its claimed 240 tokens per second on Llama 2 7B workloads.
AMD recently launched its Instinct MI350P PCIe card with 144GB of HBM3E memory and up to 4,600 TFLOPS at MXFP4 precision, but it consumes considerably more power than Skymizer’s offering.
Nvidia’s RTX PRO 6000 Blackwell consumes about 600 watts, more than double what the Skymizer card requires for comparable inference tasks.
If the HTX301 works as advertised, it could dramatically lower the barrier to entry for on-premise AI infrastructure.
If it fails to deliver, Skymizer will join the many AI hardware startups that could not live up to their promises.
Via Wccftech