- Many AI pilots fail in real-world operations, and 95% of GenAI pilots do not reach production, Salesforce states
- CRMArena-Pro lets companies test their AI agents with digital twins
- Two new benchmarks are used to stress-test AI agents
Salesforce says that companies are struggling with AI pilots that fail in real-world operations, and has launched CRMArena-Pro, a new service that lets companies create a digital twin of their operations to test AI agents before deployment.
The company cited recent MIT research which found that 95% of generative AI pilots do not even reach the production stage.
CRMArena-Pro evaluates AI agents on real tasks, such as customer service, sales forecasting and supply chain disruptions, but using synthetic data validated by experts.
Salesforce lets you stress-test AI agents using digital twins
“CRMArena-Pro creates a rigorous and rich simulated business environment framework with synthetic data, where you can safely evaluate API calls to relevant systems, as well as the ability to safeguard PII data,” the company wrote in an announcement.
By adding real-world noise to the test environment, CRMArena-Pro can better evaluate performance, strengthen resilience and close the gap between pre-deployment and post-deployment behavior.
“The result is AI agents that are capable, consistent, reliable and ready for the agentic enterprise.”
Companies can also see how AI agents handle real-world challenges such as messy data, legacy systems and complex workflows.
Salesforce said that part of the complexity comes from the wide range of models available today, and knowing which specific model or combination of models to use is not so simple.
To that end, the company has published two new benchmarks to measure agent performance: MCP-Eval, for evaluation through synthetic tasks, and MCP-Universe, which adds real-world tasks and execution-based evaluators to stress-test agents in complex scenarios.
In a previous publication, Salesforce said that CRMArena-Pro “lays the foundations for the next frontier: Enterprise General Intelligence”, and for now, users can expect “safe, capable and impactful” AI for all organizations.