Comino hit the headlines with the launch of the Grando, its water-cooled workstation packing eight NVIDIA RTX 5090 GPUs. It is far more versatile than I expected.
Dive into its configurator and you will notice that you can spec the system with up to eight RX 7900 XTX GPUs because, well, why not?
“Yes, we can pack 8x 7900 XTX, though with a longer delivery time. In fact, we can pack any eight GPUs plus an EPYC CPU in a single system,” Alexey Chistov, Comino's CTO, told me when I asked for more detail.
In fact, although it does not currently offer Intel's promising Arc GPUs, it will do so if the market demands such solutions.
“We can design a water block for any GPU; it takes about a month,” said Chistov. “But we don't chase every possible GPU, we choose specific models and brands. We only target high-end GPUs, to justify the additional price of liquid cooling, because if a card works properly when air-cooled, why bother? It also keeps down the number of water-block SKUs (stock-keeping units) we have to maintain.”
The Rimac of HPC
So how does it achieve such flexibility? The company presents itself as an engineering firm, its slogan proudly declaring that its hardware is “designed, not just assembled.” Think of Comino as the Rimac of HPC: obscenely powerful, agile and expensive. Like Rimac, it focuses on the apex of its product line and on absolute performance.
Its flagship product, the Grando, is liquid-cooled and was designed from the start to accommodate up to eight GPUs, which means it will likely remain future-proof across multiple NVIDIA generations; more on that in a moment.
One of its main objectives, Chistov told me, “is always to fit within a single PCIe slot; this is how we can populate all the PCIe slots on the motherboard and fit eight GPUs in a Grando server. The chassis is also designed by the Comino team, so everything works as one.” This is how it can modify a triple-slot GPU such as the RTX 5090 to fit in a single slot.
With that in mind, it is preparing a “solution capable of operating at a coolant temperature of 50°C without throttling, so if the coolant temperature drops to 20°C and you set the coolant flow at 3-4 L/min, the water block can remove around 1,800W of heat from a 5090 chip, with the chip temperature at around 80-90°C.”
That's right: a single Comino GPU water block could remove 1,800W of heat from a single “hypothetical 5090” that generated that much heat, provided the coolant temperature at the inlet is around 20°C and the coolant flow is no less than 3-4 liters per minute.
Pack eight such “hypothetical GPUs” together with the other components and you could end up with a 15kW system; and indeed, if such a system at full load maintained a constant 20°C coolant temperature and a per-water-block coolant flow of no less than 3-4 liters per minute, it would run “normally.”
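A quick back-of-the-envelope check shows why those flow figures matter. The sketch below computes how much the coolant warms up while carrying the quoted heat loads, assuming the coolant behaves like plain water (Comino's actual coolant mix, loop layout and total flow rate are not specified, so the 28 L/min system figure is simply 8 × 3.5 L/min as an illustration):

```python
# Back-of-the-envelope check of Comino's numbers: how much does the
# coolant warm up while carrying the quoted heat away?
# Assumes plain water: c_p ~ 4186 J/(kg*K), density ~1 kg/L.

WATER_CP = 4186.0    # specific heat of water, J/(kg*K)
WATER_DENSITY = 1.0  # kg per liter (approximation)

def coolant_delta_t(heat_watts: float, flow_l_per_min: float) -> float:
    """Temperature rise of the coolant across the block, in kelvin."""
    mass_flow = flow_l_per_min * WATER_DENSITY / 60.0  # kg/s
    return heat_watts / (mass_flow * WATER_CP)

# One 1,800W GPU at 3.5 L/min: the water warms by only ~7.4 K,
# so a 20C inlet stays far below the 50C the block is said to tolerate.
print(f"Per GPU: {coolant_delta_t(1800, 3.5):.1f} K rise")

# Hypothetical 15kW system with ~28 L/min total loop flow (8 x 3.5):
print(f"System:  {coolant_delta_t(15000, 28):.1f} K rise")
```

In both cases the coolant gains under 8°C per pass, which is why the system can run “normally” as long as the radiators keep the inlet near 20°C.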
Who needs that kind of performance?
So what kind of user splurges on multi-GPU systems? Chistov again: “There is no benefit to adding an additional 5090 if you are a gamer; it will not affect performance, because games can no longer use multiple GPUs the way they once did with SLI or, at one point, DirectX. There are several applications we focus on for multi-GPU systems:
- AI inference: this is the most in-demand workload. In this scenario, each GPU works alone, and the reason to pack more GPUs per node is to reduce the “cost per GPU” while scaling: you save rack space, spend less money on non-GPU hardware, and so on. Each GPU in the system processes AI requests, mostly generative, for example Stable Diffusion, Midjourney or DALL-E.
- GPU rendering: a popular workload, but not one that always scales well as you add GPUs. Octane and V-Ray (~15% less performance per GPU at 8 GPUs) scale quite well, but Redshift does not (~35-40% less performance per GPU at 8 GPUs).
- Life sciences: various types of scientific computation, for example CryoSPARC or RELION.
- Any GPU-bound workload in a virtualized environment. Using Hyper-V or other software, you can create multiple virtual machines to run any task, for example a remote workstation, as StorageReview did with the Grando and the six RTX 4090 GPUs it had in for review.”
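The rendering figures above translate into quite different effective speedups for an 8-GPU node. A quick sanity check, using only the per-GPU efficiency losses Chistov quotes (the 0.375 midpoint for Redshift's 35-40% range is my assumption):

```python
# Effective speedup of an N-GPU node versus a single GPU, given the
# per-GPU efficiency loss quoted for each renderer.

def effective_speedup(num_gpus: int, per_gpu_loss: float) -> float:
    """num_gpus GPUs, each at (1 - per_gpu_loss) of single-GPU speed."""
    return num_gpus * (1.0 - per_gpu_loss)

print(effective_speedup(8, 0.15))   # Octane / V-Ray: ~6.8x one GPU
print(effective_speedup(8, 0.375))  # Redshift (35-40% midpoint): ~5x
```

So even the "poorly scaling" Redshift case still gets roughly 5x the throughput of a single card; it is a question of diminishing returns, not of the extra GPUs being wasted.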
Specifically for the RTX 5090, the most important improvement for AI workloads is the jump in memory capacity to 32GB (up 33% from the 4090's 24GB), which means the new NVIDIA flagship is better suited to inference, since you can fit a much bigger model in memory. Then there is the much higher memory bandwidth, which also helps.

Overkill?
In his review of the RTX 5090, TechRadar's John Loeffler calls it the supercar of graphics cards and asks whether it is simply too powerful, suggesting it is an absolute glutton for power.
“It's excessive,” he jokes, “especially if you only want it for gaming, since the monitors that can really handle the frames this GPU can push out are probably years away.”