- Security researchers found three flaws in the Nvidia Triton Inference Server
- When chained together, they could grant remote code execution
- A patch has been released, so users should update immediately
Nvidia's Triton Inference Server contained three vulnerabilities that, when combined, could lead to remote code execution (RCE) and other risks, Wiz security researchers have warned.
Triton is a free, open source tool for Windows and Linux that helps companies run AI models efficiently on servers, whether in the cloud, on-premises, or at the edge.
It supports many popular frameworks and speeds up inference by serving multiple models at the same time and batching similar requests together.
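To illustrate how a Triton deployment is typically reached over the network (and why an exposed endpoint matters), here is a minimal, hypothetical client sketch using Triton's standard HTTP inference API. The host, port, model name, and tensor layout are assumptions for illustration only, not details from the Wiz report.

```python
# Minimal sketch of a client calling a Triton HTTP inference endpoint.
# The host/port, the model name "example_model", and the input tensor
# layout below are assumptions for illustration only.
import requests

TRITON_URL = "http://localhost:8000"  # assumption: Triton's default HTTP port

payload = {
    "inputs": [
        {
            "name": "INPUT0",          # hypothetical input tensor name
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

# POST an inference request to the model's /infer endpoint
resp = requests.post(
    f"{TRITON_URL}/v2/models/example_model/infer",
    json=payload,
    timeout=5,
)
resp.raise_for_status()
print(resp.json())  # the server returns the output tensors as JSON
```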
Patching the flaws
Wiz found three flaws in the Python backend:
CVE-2025-23319 (an out-of-bounds write with a severity score of 8.1/10), CVE-2025-23320 (a shared memory limit exceeded vulnerability with a severity score of 7.5/10), and CVE-2025-23334 (an out-of-bounds read with a score of 5.9/10).
"When chained together, these flaws can allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE)," Wiz said in its security advisory.
The risk is real, they added, emphasizing that companies could lose confidential data:
"This poses a critical risk to organizations using Triton for AI/ML, as a successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulation of the AI model's responses, and a foothold for attackers to move deeper into a network," the researchers added.
Nvidia said it addressed the issues in version 25.07 and recommends that users update to the latest version as soon as possible.
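For administrators checking their deployments, Triton exposes a server metadata endpoint over HTTP that reports the running server version. Below is a minimal sketch, assuming the default HTTP port of 8000 on a locally reachable host (both assumptions; adjust for your deployment). Note that the version reported is Triton's internal server version string, which is not the same as the 25.07 container release tag.

```python
# Minimal sketch: query Triton's /v2 server-metadata endpoint to confirm
# which server version a deployment is running before and after patching.
# The URL below is an assumption (default HTTP port on localhost).
import requests

TRITON_URL = "http://localhost:8000"

resp = requests.get(f"{TRITON_URL}/v2", timeout=5)
resp.raise_for_status()
meta = resp.json()

# The metadata response includes the server name and its version string.
print(f"Server:  {meta.get('name')}")
print(f"Version: {meta.get('version')}")
```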
At the time of publication, there were no reports of these flaws being exploited in the wild; however, many cybercriminals wait until a vulnerability is disclosed and then target organizations that are less diligent about patching and leave their endpoints vulnerable for longer periods of time.
Via The Hacker News