- The world's fastest AI chip manufacturer makes a splash with a brief welcome aboard
- Cerebras
- DeepSeek R1 will run in Cerebras' cloud and the data will remain in the US
Cerebras has announced, in a not-so-surprising move, that it will support DeepSeek, more specifically the R1 70B reasoning model. The move comes after Groq and Microsoft confirmed that they would also bring the new kid on the AI block to their respective clouds. AWS and Google Cloud have yet to do so, but anyone can run the open source model anywhere, even locally.
The inference chip specialist will run DeepSeek R1 70B at 1,600 tokens/second, which it claims is 57 times faster than any GPU-based R1 provider; one can deduce that roughly 28 tokens/second is what the cloud GPU solution (in that case, DeepInfra) apparently reaches. Coincidentally, Cerebras' latest chip is 57 times bigger than the H100. I have contacted Cerebras to find out more about that claim.
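As a sanity check on those numbers, here is a minimal sketch in plain Python, using only the figures quoted above, showing how the ~28 tokens/second figure falls out of the 57x claim:

```python
# Figures quoted above: Cerebras' claimed throughput and speed-up factor.
cerebras_tps = 1600   # tokens/second on Cerebras hardware (claimed)
speedup = 57          # "57 times faster than any GPU-based R1 provider"

# Implied throughput of the fastest GPU-based provider (DeepInfra, per the article).
gpu_tps = cerebras_tps / speedup
print(f"Implied GPU throughput: {gpu_tps:.1f} tokens/second")  # ~28.1
```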
Cerebras research also showed DeepSeek to be more accurate than OpenAI's models in a series of tests. The model will run on Cerebras hardware in US-based data centers to calm the privacy concerns that many experts have voiced. DeepSeek, the app, will send its data (and metadata) to China, where it will most likely be stored. Nothing surprising here, since almost all apps, especially free ones, capture user data for legitimate reasons.
Cerebras' wafer-scale solution is uniquely positioned to benefit from the impending AI cloud inference boom. The WSE-3, which is the fastest AI chip (or HPC accelerator) in the world, has almost a million cores and a staggering four trillion transistors. More importantly, though, it has 44GB of SRAM, which is the fastest memory available, even faster than the HBM found in Nvidia's GPUs. Since the WSE-3 is one giant die, the available memory bandwidth is huge, several orders of magnitude greater than what the Nvidia H100 can muster (and, for that matter, the H200).
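To put "orders of magnitude" in perspective, here is a minimal back-of-the-envelope sketch; the bandwidth figures are the publicly advertised ones (Cerebras quotes 21 PB/s of on-chip SRAM bandwidth for the WSE-3, while Nvidia lists roughly 3.35 TB/s of HBM3 bandwidth for the H100 SXM and 4.8 TB/s for the H200), not numbers from this article:

```python
import math

# Publicly advertised peak memory bandwidth figures (assumptions, not from the article).
WSE3_SRAM_BW = 21_000   # Cerebras WSE-3 on-chip SRAM, in TB/s (21 PB/s)
H100_HBM3_BW = 3.35     # Nvidia H100 SXM HBM3, in TB/s
H200_HBM3E_BW = 4.8     # Nvidia H200 HBM3e, in TB/s

for name, bw in [("H100", H100_HBM3_BW), ("H200", H200_HBM3E_BW)]:
    ratio = WSE3_SRAM_BW / bw
    print(f"WSE-3 vs {name}: {ratio:,.0f}x, ~{math.log10(ratio):.1f} orders of magnitude")
```

At those advertised peaks, the gap works out to roughly three to four orders of magnitude, consistent with the claim above.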
A price war is brewing ahead of the launch of the WSE-4
Pricing has yet to be revealed, but Cerebras, which is usually shy about that particular detail, disclosed last year that running Llama 3.1 405B on its hardware cost $6 per million input tokens and $12 per million output tokens. Expect DeepSeek to be available for much less.
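For a sense of what those per-token rates mean in practice, here is a minimal sketch of the usual cost arithmetic; the Llama 3.1 405B rates are the ones above, while the request sizes are made-up illustrative numbers:

```python
# Cerebras' disclosed Llama 3.1 405B rates, in USD per million tokens (from the article).
INPUT_RATE = 6.0
OUTPUT_RATE = 12.0

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at per-million-token pricing."""
    return input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

# Hypothetical workload: 2,000-token prompt, 500-token answer, one million requests.
per_request = request_cost(2_000, 500)
print(f"Per request: ${per_request:.4f}; per million requests: ${per_request * 1e6:,.0f}")
```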
The WSE-4 is the next iteration of the WSE-3 and will deliver a significant performance boost for DeepSeek and similar reasoning models when it launches, expected in 2026 or 2027 (depending on market conditions).
DeepSeek's arrival is also likely to shake the proverbial AI money tree, bringing more competition to established players such as OpenAI or Anthropic and pushing prices down.
A quick look at the Docsbot.ai LLM API calculator shows OpenAI is almost always the most expensive across all configurations, sometimes by several orders of magnitude.