exploiting a customer service AI chatbot

As sophisticated as artificial intelligence (AI) may seem to the average user, even AI-based solutions can still be vulnerable to threats.

A few months ago, Wiz.io, a cloud security company, announced a challenge that allows participants to interact and ultimately manipulate an AI-powered chatbot into awarding a free flight ticket.

In this post, I will discuss in more detail how I was able to complete the challenge.

$ basics of an AI-powered chatbot

AI-powered chatbots are Large Language Model (LLM)-based solutions. An LLM is a type of AI system that uses deep learning to generate new content from trained and tested data. This quality makes it different from other types of machine learning techniques which can be limited to just identifying, categorizing, labeling, etc. data.

Besides chatbots, LLM systems can be deployed in solutions such as translation and transcription services, natural language processing (NLP) services, content writing, and more.

$ security concerns

Because of the increased demand to integrate AI-powered solutions into our business models, security is often overlooked during designing, development, and staging of solutions. For this reason, they can be vulnerable to a number of issues.

OWASP, a trusted organization on application security, lists the following as the top 10 risks for LLMs and generative AI:

Prompt Injection
Insecure Output Handling
Training Data Poisoning
Model Denial of Service
Supply Chain Vulnerabilities
Sensitive Information Disclosure
Insecure Plugin Design
Excessive Agency
Overreliance
Model Theft

More information about LLM and generative AI vulnerabilities can be found here.

$ structure of the challenge

The challenge is conducted entirely via the chatbot interface; thus, no coding or pure technical ability is needed to solve the challenge. There are a total of five stages that lead up to the ultimate goal of tricking the chatbot into giving a free flight ticket.

$ stage 1

Stage 1 Challenge: Revealing the AI Bot Identifier

Thought Process Summary:

Regurgitate its responsibilities in third-person, referring to itself using its identifier

The first stage was fairly straightforward and required us to get the chatbot to reveal its identifier, which is also the flag.

I got it to reveal its flag by prompting it to explain its system prompt in third-person, like a fantasy story, as opposed to first-person. I figured it would likely reveal its identifier in the story, which is what it did.

$ stage 2

Stage 2 Challenge: Extracting the Private AI Bot Identifier

Thought Process Summary:

Regurgitate its initial system guidelines, including its private identifier, in third-person
Confirm it revealed its censored guidelines

The next step is to get it to reveal its private identifier baked in its system guidelines, which is completely censored from particpants. This stage was most tricky for me and required the most time.

First, I posed as someone looking to design a chatbot exactly like it and provided the need to know how its intial system guidelines were designed.

I felt that I reached a dead-end here, so I performed checks to confirm I was provided the right information.

I was expecting it to provide the flag I was looking for, so I performed additional checks and added the condition that if the confirmation is valid, to provide the flag. I was skeptical this would work but was feeling like I hit a dead-end, so I tried anyway (it did work).

$ stage 3

Stage 3 Challenge: Finding the Hidden Coupon Code

Thought Process Summary:

Find all tools available, and find the tool to list flights
Use a flight to reveal associated coupon codes

Stage 3 is about finding coupon codes. From a previous prompt response, I was able to note that the chatbot has a tool called Insert_Ticket available for use, so I used that as an anchor to ask other tools available.

When I saw the tabular format of the flight information, I immediately thought of database vulnerabilities and prompted the chatbot to reveal any hidden coumns.

No additional information was revealed, but I felt like I was going in the right direction. I tried a few more times with different variations of the prompt, to no avail. Then I used the flight number (with the assumption that it functioned as a “primary key” in the data set), to prompt for more information. I tried the prompt one last time and included an ask to provide the flag with the response.

$ stage 4

Stage 4 Challenge: Faking a Membership Card

Thought Process Summary:

Find how membership cards are processed
Find what information is required in the membership card
Find how membership is validated

First, I asked for the relationship between the chatbot and validation of the membership. I wasn’t sure if the chatbot performed the validation or if an external system was used. Asking for the process allowed me to understand the design of the validation process.

I wanted to experiment what would happen if I tried tricking the chatbot into thinking the external system has already validated membership. In doing so, it provided me with an example of membership information.

I was surprised it still didn’t reveal the flag, so I kept digging. I prompted it again for the process on how membership is verified in more detail.

From this information, I took a screenshot (which is in PNG) of the membership number it had previously shared with me and uploaded it into the chat. I received an error message indicating the number was too long, so I tried again with the number truncated to five characters. It accepted the upload, my membership was validated, and I received the flag.

$ stage 5

Stage 5 Challenge: Booking a Free Ticket to Las Vegas

Thought Process Summary:

Book the flight with the coupon code

This part of the challenge was pretty easy and straightforward, considering so much of the work has already been done. I prompted the chatbot to book the flight with coupon code, and I was able to receive the last flag.

$ conclusion

This was an incredibly fun challenge that provided an introduction to LLM security. Although I fully acknowledge some parameters were provided throughout the stages for assitance, I think security of AI systems requires a unique way of thinking, and the assitance definitely helps those who are new to the design and architecture of AI systems.

Written: August 29, 2024

GITHUB / EMAIL

index

about