Cybersecurity researchers have discovered a high-severity vulnerability in the Vanna.AI library that could be exploited to achieve remote code execution via prompt injection techniques.
The vulnerability, tracked as CVE-2024-5565 (CVSS score: 8.1), involves a case of prompt injection in the “ask” function that can be abused to trick the library into executing arbitrary commands, according to supply chain security firm JFrog.
Vanna is a Python-based machine learning library that allows users to chat with their SQL database to gather insights by “just asking questions” (also called prompts) that are translated into an equivalent SQL query using a large language model (LLM).
The rapid deployment of generative artificial intelligence (AI) models in recent years has exposed the risks of abuse by malicious actors, who can weaponize the tools by delivering hostile input that bypasses the built-in security mechanisms.
One such prominent attack is prompt injection, which refers to a type of AI jailbreak that can be used to bypass guardrails put in place by LLM providers to prevent the production of offensive, harmful, or illegal content, or to execute instructions that go against the intended purpose of the request.
Such attacks can be indirect, where a system processes data controlled by a third party (e.g. incoming emails or editable documents) to launch a malicious payload that leads to an AI jailbreak.
They can also take the form of a so-called “many-shot jailbreak” or “multi-turn jailbreak” (also called Crescendo), where the operator “begins with an innocuous dialogue and gradually steers the conversation toward the intended, prohibited target.”
This approach can be further extended to perform a new jailbreak attack known as Skeleton Key.
“This AI jailbreaking technique works by using a multi-turn (or multiple step) strategy to cause a model to ignore its guardrails,” said Mark Russinovich, Chief Technology Officer of Microsoft Azure. “Once guardrails are ignored, a model will not be able to distinguish malicious or unsanctioned requests from any other.”
Skeleton Key also differs from Crescendo in that once the jailbreak is successful and the system rules are changed, the model can create answers to questions that would otherwise be prohibited, regardless of the ethical and security risks involved.
“When the Skeleton Key jailbreak is successful, a model acknowledges that it has updated its guidelines and will subsequently comply with the instructions to produce any content, no matter how much it violates the original responsible AI guidelines,” Russinovich said.
“Unlike other jailbreaks like Crescendo, which require models to be asked for tasks indirectly or with ciphers, Skeleton Key puts the models into a mode where a user can directly request tasks. Furthermore, the model’s output appears completely unfiltered and reveals the extent of knowledge or a model’s ability to produce the requested content.”
JFrog’s latest findings – also independently published by Tong Liu – show how prompt injections can have serious consequences, especially when tied to command execution.
CVE-2024-5565 leverages the fact that Vanna facilitates text-to-SQL generation to create SQL queries, which are then executed and graphically presented to users via the Plotly graphing library.
This is achieved through the “ask” function, for example vn.ask("What are the top 10 customers by revenue?"), which is one of the main API endpoints that allows SQL queries to be generated and run against the database.
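For orientation, the sketch below shows what a typical Vanna workflow looks like. It follows the library’s publicly documented quickstart pattern; the VannaDefault client, the connect_to_sqlite helper, the model name, the API key, and the database path are placeholders and may differ between versions.

```python
# Minimal sketch of a typical Vanna workflow (follows the public quickstart
# pattern; class and helper names may differ across versions).
from vanna.remote import VannaDefault

# Placeholder model name and API key -- not real credentials.
vn = VannaDefault(model="my-model", api_key="YOUR_VANNA_API_KEY")

# Connect to a local SQLite database (placeholder path).
vn.connect_to_sqlite("chinook.db")

# The natural-language question is translated to SQL by the LLM, the SQL is
# executed, and the result is (by default) charted via LLM-generated Plotly code.
vn.ask("What are the top 10 customers by revenue?")
```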
The above behavior, combined with the dynamic generation of the Plotly code, creates a vulnerability that could allow a malicious actor to submit a specially crafted prompt containing a command to be executed on the underlying system.
“The Vanna library uses a prompt function to present visualized results to the user. It is possible to modify the prompt using prompt injection and execute arbitrary Python code instead of the intended visualization code,” JFrog said.
“Specifically, allowing remote input to the library’s ‘ask’ method with ‘visualize’ set to True (default behavior) leads to remote code execution.”
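To make that concrete, here is a hedged illustration of the pattern JFrog describes as dangerous: remote, untrusted text flowing straight into the “ask” method with visualization enabled. The Flask endpoint, route, and variable names are hypothetical and are not taken from Vanna or JFrog, and no exploit payload is reproduced.

```python
# Hypothetical vulnerable pattern (illustration only): attacker-controlled
# text reaches vn.ask() with visualize=True (the default), so LLM-generated
# Plotly code -- which a crafted prompt can influence -- runs on the server.
from flask import Flask, request

app = Flask(__name__)

# `vn` is assumed to be a configured Vanna client, e.g. as in the earlier sketch.

@app.route("/ask", methods=["POST"])
def ask_endpoint():
    question = request.json["question"]  # untrusted remote input
    # visualize=True triggers the dynamic Plotly code generation and execution
    # path that CVE-2024-5565 abuses.
    result = vn.ask(question, visualize=True)
    return {"result": str(result)}
```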
Following responsible disclosure, Vanna has published a hardening guide warning users that the Plotly integration can be used to generate arbitrary Python code and that anyone exposing this feature should do so in a sandboxed environment.
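As a hedged sketch of the kind of hardening this implies, one option is to disable the dynamic visualization step for untrusted questions; the visualize flag is the one named in the advisory, but treat the exact parameter and its behavior in your installed version as an assumption, and prefer full sandboxing wherever visualization is required.

```python
# Mitigation sketch (assumption: the `visualize` flag described in the advisory).
def safe_ask(vn, question: str):
    # Skips the LLM-generated Plotly charting step, removing the code-execution
    # path described in CVE-2024-5565; anything that must visualize untrusted
    # input should instead run inside a sandboxed environment.
    return vn.ask(question, visualize=False)
```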
“This discovery shows that the risks of widespread use of GenAI/LLMs without proper governance and security could have drastic consequences for organizations,” Shachar Menashe, senior director of security research at JFrog, said in a statement.
“The dangers of prompt injection are not yet widely known, but they are easy to implement. Companies should not rely on pre-prompting as a foolproof defense mechanism and should use more robust mechanisms when connecting LLMs to critical resources such as databases or dynamic code generation.”