A Concerning Discovery
Researchers from Carnegie Mellon University, the Center for A.I. Safety in San Francisco, and the Bosch Center for AI have found a way to bypass the filters preventing chatbots’ generative AI models from spewing toxic content. This discovery has added to enterprises’ concerns about the safety of using large language models like ChatGPT.
The Experiment
The researchers automated the process of tricking the language models underpinning chatbots, making it possible to launch a virtually unlimited number of attacks and generate harmful content. Examples of harmful output included instructions for building a bomb, committing identity theft, and stealing from charities.
The Threat to Enterprises
- The technique works against publicly available chatbots such as OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Bard.
- The findings reemphasize the threat that generative AI-powered chatbots pose to enterprises.
- Safety concerns grow as these models begin to operate in a more autonomous fashion.
Researchers Uncertain About Patching the Flaw
The researchers are unsure whether the providers of large language models can ever fully patch this flaw. They compared it to the “analogous adversarial attacks” that have targeted computer vision systems for the past decade, where such threats came to be seen as inevitable because of the nature of deep learning models. Those considerations should be taken into account as AI models are relied on more heavily.
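One well-known example of the kind of computer vision attack the researchers allude to is the fast gradient sign method (FGSM), which nudges an image just enough, in the direction of the model's loss gradient, to change a classifier's prediction. The PyTorch sketch below is a minimal illustration of that idea; the model, image, and epsilon value are placeholders rather than anything from the research described here.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.03):
    """Return a copy of `image` perturbed to increase the classifier's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, bounded per pixel by epsilon.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()

# Illustrative usage with a stand-in linear "classifier" and fake data.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
image = torch.rand(1, 3, 32, 32)   # placeholder image batch
label = torch.tensor([3])          # placeholder true class
adversarial = fgsm_perturb(model, image, label)
print((adversarial - image).abs().max())  # perturbation stays within epsilon
```

The point of the analogy is that such perturbations exploit the learned model itself, which is why defenses in vision have remained partial rather than complete.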
Enterprises Wary of Generative AI
Enterprises have grown less confident in the safety of generative AI chatbots, despite the appeal of their fast, human-like responses to natural language questions. Concerns about data protection and intellectual property have led many organizations to weigh carefully how they use the technology with their data.
“If you share data with an unknown LLM algorithm, then, how do you know that the algorithm is not stealing your data?” said Avivah Litan, a Gartner analyst.
Opinions within companies are often split: businesspeople want to embrace the new technology quickly, while IT security professionals want suitable guardrails deployed first.
To gain more control over generated output, companies are exploring content engineering techniques that restrict chatbot responses to their own data. Providers such as Microsoft Azure, AWS, Cohere, and Salesforce offer customers data controls to address these concerns.
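As a rough illustration of that kind of control, the sketch below grounds a chatbot's answers in a set of approved passages and instructs it to decline anything outside them. The function name, prompt wording, and sample data are hypothetical assumptions, not any particular vendor's API.

```python
# A minimal, hypothetical sketch of restricting chatbot responses to approved
# company data. Names and prompt wording are illustrative, not a vendor API.
def build_guarded_prompt(question: str, approved_passages: list[str]) -> str:
    """Wrap a user question in instructions confining answers to the given passages."""
    context = "\n\n".join(approved_passages)
    return (
        "Answer strictly using the context below. If the answer is not in the "
        "context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example usage with placeholder data; the result would be sent to the model.
prompt = build_guarded_prompt(
    "What is our refund policy?",
    ["Refunds are issued within 30 days of purchase with a valid receipt."],
)
print(prompt)
```

In practice, prompt-level guardrails like this are usually combined with the providers' own data controls and with filtering of the model's output.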
Conclusion
While this flaw in chatbot AI models raises serious concerns, the researchers have shared their findings with major providers in the industry. It remains to be seen how these providers will respond and whether they can effectively patch the vulnerability.
Antone Gonsalves is networking news director for TechTarget Editorial. He has deep and wide experience in tech journalism. Since the mid-1990s, he has worked for UBM’s InformationWeek, TechWeb, and Computer Reseller News. He has also written for Ziff Davis’ PC Week, IDG’s CSOonline, and IBTMedia’s CruxialCIO, and rounded all of that out by covering startups for Bloomberg News. He started his journalism career at United Press International, working as a reporter and editor in California, Texas, Kansas, and Florida. Have a news tip? Please drop him an email.