"Illusions" affect "reliability"! Salesforce executives say "trust in large models has declined," and usage has decreased

Wallstreetcn
2025.12.22 00:22

Executives at enterprise software giant Salesforce have acknowledged that their trust in large models has declined over the past year. The company is reducing its reliance on generative AI in its main AI product, Agentforce, and is instead adopting more fundamental "deterministic" automation technologies to enhance software reliability.

On Monday, The Information reported that Sanjna Parulekar, Senior Vice President of Product Marketing at Salesforce, stated, "All of us had more confidence in large language models a year ago." The company is now using deterministic automation based on predefined instructions in Agentforce, rather than fully relying on the reasoning and interpretive capabilities of AI models.

This strategic adjustment aims to address technical failures such as "hallucinations" that large models encounter when handling precise tasks, ensuring that critical business processes follow the exact same steps every time. Salesforce's website now emphasizes that Agentforce can help "eliminate the inherent randomness of large models."

Salesforce is one of the world's most valuable software companies, and its partial retreat from large models could affect the thousands of businesses that use the technology. Agentforce is currently expected to generate more than $500 million in annual revenue.

Technical Reliability Challenges Drive Strategic Shift

Salesforce has encountered several technical challenges with large models in practical applications. Muralidhar Krishnaprasad, Chief Technology Officer of Agentforce, pointed out that when given more than eight instructions, large models begin to omit instructions, which is not ideal for tasks requiring precise handling.
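To make the omission failure concrete, here is a minimal sketch of one generic mitigation: checking the model's output against each instruction in plain code and re-issuing any that were dropped. The article does not describe Salesforce's actual mechanism, and every name below (call_llm, covers, run_with_checklist) is a placeholder invented for illustration.

```python
# A minimal sketch of a checklist guard against instruction omission.
# Not Salesforce's implementation: call_llm() stands in for any model
# API, and covers() is a deliberately naive compliance check.

INSTRUCTIONS = [
    "greet the customer by name",
    "confirm the account number",
    "offer to summarize the open case",
]

def call_llm(task: str, instructions: list[str]) -> str:
    """Placeholder for a real model call."""
    return f"[model output for {task!r} covering {len(instructions)} instructions]"

def covers(response: str, instruction: str) -> bool:
    """Naive stand-in for a real per-instruction compliance check."""
    return instruction.split()[0] in response.lower()

def run_with_checklist(task: str) -> str:
    response = call_llm(task, INSTRUCTIONS)
    # Verify each instruction deterministically rather than trusting the model.
    for instruction in INSTRUCTIONS:
        if not covers(response, instruction):
            # Re-issue only the omitted instruction, one at a time.
            response += "\n" + call_llm(task, [instruction])
    return response

print(run_with_checklist("handle a billing inquiry"))
```

The point of the pattern is that the checklist lives outside the model: even if the model drops an instruction from a long list, ordinary code notices and retries.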

The experience of home security company Vivint corroborates these issues. The company uses Agentforce to handle customer support for 2.5 million customers but has faced reliability problems. For instance, despite being instructed to send a satisfaction survey to customers at the end of each interaction, Agentforce sometimes fails to send the survey for unspecified reasons.

To address such issues, Vivint collaborated with Salesforce to set up "deterministic triggers" within Agentforce to ensure that surveys are sent every time. This form of basic automation not only reduces operational costs but also provides customers with lower prices.
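As an illustration only (Interaction, send_survey, and close_interaction are invented names, not Salesforce or Vivint APIs), a deterministic trigger amounts to an unconditional step in ordinary code, so it fires on every interaction regardless of what the model decides:

```python
# A minimal sketch of a "deterministic trigger": the survey send is
# hard-coded to run whenever an interaction closes, so it cannot be
# skipped the way a model-driven step sometimes is. All names here
# are illustrative, not actual Salesforce or Vivint APIs.

from dataclasses import dataclass

@dataclass
class Interaction:
    customer_email: str
    resolved: bool = False

def send_survey(email: str) -> None:
    # Placeholder for a real email or SMS call.
    print(f"Satisfaction survey sent to {email}")

def close_interaction(interaction: Interaction) -> None:
    interaction.resolved = True
    # Unconditional, rule-based step: no model judgment involved.
    send_survey(interaction.customer_email)

close_interaction(Interaction("customer@example.com"))
```

The trade-off is flexibility for reliability: the step runs every time, exactly the property the article says critical business processes need.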

Addressing AI "Drift" Phenomenon

Salesforce executive Phil Mui described another key challenge in a blog post in October: the AI "drift" phenomenon. According to Mui, the company's "most complex customers" encounter difficulties when using AI, as "when users ask unrelated questions, the AI agent loses focus on its primary objectives."

For example, an AI chatbot programmed to guide customers through filling out a form may "lose focus" when customers ask questions unrelated to the form. To address this, Salesforce developed the Agentforce Script system, which minimizes the "unpredictability" of large language models by identifying which tasks can be handled by "agents" that do not use large models. The system is currently in testing and is aimed at ensuring that AI agents stay focused on their core tasks when conversations go off track.
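The underlying idea can be sketched as a scripted outer loop that owns the task state, with the model consulted only inside a step. This is a hedged toy under my own assumptions, not Agentforce Script's design; the relevance check is a naive keyword match where a real system would use a classifier.

```python
# A toy sketch of a drift guardrail: plain code owns the form-filling
# state, so off-topic questions can never move the agent off its task.
# This illustrates the general idea only, not Agentforce Script itself.

FORM_STEPS = ["full name", "email address", "shipping address"]

def is_on_topic(user_message: str, step: str) -> bool:
    # Naive keyword check; a real system would use a trained classifier.
    return any(word in user_message.lower() for word in step.split())

def next_prompt(user_message: str, step_index: int) -> tuple[str, int]:
    step = FORM_STEPS[step_index]
    if not is_on_topic(user_message, step):
        # Off-topic input never advances or resets the task state,
        # so the agent cannot lose focus on its primary objective.
        return f"Happy to help with that afterwards. First, your {step}?", step_index
    if step_index + 1 < len(FORM_STEPS):
        return f"Got it. Next, your {FORM_STEPS[step_index + 1]}?", step_index + 1
    return "Thanks, the form is complete.", step_index

reply, idx = next_prompt("what's the weather like today?", 0)
print(reply)  # the scripted loop stays on the form instead of drifting
```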

Adjustments and Optimizations in Practical Applications

In its own operations, Salesforce has also dialed back its use of large models. CEO Marc Benioff previously said that Agentforce, which relies in part on OpenAI's models, now handles most of Salesforce's customer service inquiries, allowing the company to cut roughly 4,000 customer service staff. Even so, the company appears to have recently reduced how heavily its customer service agents lean on large models.

For example, last week, when asked for help with Agentforce technical issues, the company's support agent responded with a list of blog post links instead of asking for more information or discussing the potential problem. The response resembled the basic chatbots that businesses have used for years to field inquiries from customers and website visitors.

A Salesforce spokesperson stated that the company has "refined the topic structure, strengthened protective measures, improved retrieval quality, and adjusted responses to be more specific, contextually relevant, and aligned with real customer needs" this year. The spokesperson noted that the number of customer issues resolved with the help of agents is greater than ever, and it is expected that the number of resolved conversations will grow by 90% in the fiscal year ending in late January.

This trend reflects challenges across the industry. Earlier this month, a chatbot that enterprise AI startup Sierra powers for Gap Inc. answered questions about adult products and Nazi Germany, underscoring how widespread the problem of large models straying from their intended use has become.