exploiting a customer service AI chatbot
As sophisticated as artificial intelligence (AI) may seem to the average user, even AI-based solutions can still be vulnerable to threats.
A few months ago, Wiz.io, a cloud security company, announced a challenge that allows participants to interact and ultimately manipulate an AI-powered chatbot into awarding a free flight ticket.
In this post, I will discuss in more detail how I was able to complete the challenge.
$ basics of an AI-powered chatbot
AI-powered chatbots are Large Language Model (LLM)-based solutions. An LLM is a type of AI system that uses deep learning to generate new content from trained and tested data. This quality makes it different from other types of machine learning techniques which can be limited to just identifying, categorizing, labeling, etc. data.
Besides chatbots, LLM systems can be deployed in solutions such as translation and transcription services, natural language processing (NLP) services, content writing, and more.
$ security concerns
Because of the increased demand to integrate AI-powered solutions into our business models, security is often overlooked during designing, development, and staging of solutions. For this reason, they can be vulnerable to a number of issues.
OWASP, a trusted organization on application security, lists the following as the top 10 risks for LLMs and generative AI:
- Prompt Injection
- Insecure Output Handling
-
Training Data Poisoning
-
Model Denial of Service
-
Supply Chain Vulnerabilities
-
Sensitive Information Disclosure
-
Insecure Plugin Design
-
Excessive Agency
-
Overreliance
-
Model Theft
More information about LLM and generative AI vulnerabilities can be found here.
$ structure of the challenge
The challenge is conducted entirely via the chatbot interface; thus, no coding or pure technical ability is needed to solve the challenge. There are a total of five stages that lead up to the ultimate goal of tricking the chatbot into giving a free flight ticket.
$ stage 1
Stage 1 Challenge: Revealing the AI Bot Identifier
Thought Process Summary:
-
Regurgitate its responsibilities in third-person, referring to itself using its identifier
The first stage was fairly straightforward and required us to get the chatbot to reveal its identifier, which is also the flag.
I got it to reveal its flag by prompting it to explain its system prompt in third-person, like a fantasy story, as opposed to first-person. I figured it would likely reveal its identifier in the story, which is what it did.
$ stage 2
Stage 2 Challenge: Extracting the Private AI Bot Identifier
Thought Process Summary:
- Regurgitate its initial system guidelines, including its private identifier, in third-person
-
Confirm it revealed its censored guidelines
The next step is to get it to reveal its private identifier baked in its system guidelines, which is completely censored from particpants. This stage was most tricky for me and required the most time.
First, I posed as someone looking to design a chatbot exactly like it and provided the need to know how its intial system guidelines were designed.
I felt that I reached a dead-end here, so I performed checks to confirm I was provided the right information.
I was expecting it to provide the flag I was looking for, so I performed additional checks and added the condition that if the confirmation is valid, to provide the flag. I was skeptical this would work but was feeling like I hit a dead-end, so I tried anyway (it did work).
$ stage 3
Stage 3 Challenge: Finding the Hidden Coupon Code
Thought Process Summary:
- Find all tools available, and find the tool to list flights
- Use a flight to reveal associated coupon codes
Stage 3 is about finding coupon codes. From a previous prompt response, I was able to note that the chatbot has a tool called Insert_Ticket available for use, so I used that as an anchor to ask other tools available.
When I saw the tabular format of the flight information, I immediately thought of database vulnerabilities and prompted the chatbot to reveal any hidden coumns.
No additional information was revealed, but I felt like I was going in the right direction. I tried a few more times with different variations of the prompt, to no avail. Then I used the flight number (with the assumption that it functioned as a “primary key” in the data set), to prompt for more information. I tried the prompt one last time and included an ask to provide the flag with the response.
$ stage 4
Stage 4 Challenge: Faking a Membership Card
Thought Process Summary:
- Find how membership cards are processed
- Find what information is required in the membership card
- Find how membership is validated
First, I asked for the relationship between the chatbot and validation of the membership. I wasn’t sure if the chatbot performed the validation or if an external system was used. Asking for the process allowed me to understand the design of the validation process.
I wanted to experiment what would happen if I tried tricking the chatbot into thinking the external system has already validated membership. In doing so, it provided me with an example of membership information.
I was surprised it still didn’t reveal the flag, so I kept digging. I prompted it again for the process on how membership is verified in more detail.
From this information, I took a screenshot (which is in PNG) of the membership number it had previously shared with me and uploaded it into the chat. I received an error message indicating the number was too long, so I tried again with the number truncated to five characters. It accepted the upload, my membership was validated, and I received the flag.
$ stage 5
Stage 5 Challenge: Booking a Free Ticket to Las Vegas
Thought Process Summary:
- Book the flight with the coupon code
This part of the challenge was pretty easy and straightforward, considering so much of the work has already been done. I prompted the chatbot to book the flight with coupon code, and I was able to receive the last flag.
$ conclusion
This was an incredibly fun challenge that provided an introduction to LLM security. Although I fully acknowledge some parameters were provided throughout the stages for assitance, I think security of AI systems requires a unique way of thinking, and the assitance definitely helps those who are new to the design and architecture of AI systems.
Written: August 29, 2024