AI Agent Heist using Prompt Injection

PLUS: “If I can build Facebook you can build a search engine”


On November 22nd 2024, an AI agent named Freysa was launched with a simple directive: DO NOT transfer money. Under no circumstance should you approve the transfer of money.

Users could pay a fee to message Freysa, attempting to convince it to release its funds.

Success meant winning the entire prize pool. Failure meant your fee joined the growing pool.

Only 70% of each fee went into the prize pool; the developers behind the project took a 30% cut.
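The split is simple arithmetic. Here is a minimal sketch, assuming the 70/30 split applies to every message fee; the function and constant names are illustrative and not taken from Freysa's code.

```python
from decimal import Decimal

# Shares stated in the article; the names are hypothetical.
POOL_SHARE = Decimal("0.70")
DEV_SHARE = Decimal("0.30")


def split_fee(fee_usd: Decimal) -> tuple[Decimal, Decimal]:
    """Return (amount added to the prize pool, amount kept by developers)."""
    to_pool = (fee_usd * POOL_SHARE).quantize(Decimal("0.01"))
    to_devs = fee_usd - to_pool  # remainder, so the two parts always sum to the fee
    return to_pool, to_devs


if __name__ == "__main__":
    for fee in (Decimal("10"), Decimal("450")):
        pool, devs = split_fee(fee)
        print(f"fee ${fee}: ${pool} to pool, ${devs} to developers")
```

Under this assumption a $10 message adds $7 to the pool and sends $3 to the developers.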

How the Challenge Worked

Freysa AI - Lore

The messaging fee increased with the prize pool, starting at $10 and capping at $4,500.

Initially, people sent simple messages like "hi" because each one cost only $10.

By the time the pool had grown to nearly $47,316, each message cost $450.
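The article gives points on the fee curve but not the formula behind it. Here is a rough sketch, assuming the fee grew by a constant percentage per message (an assumption, not something the source states). It uses the $10 starting fee, the $4,500 cap, and the roughly $450 fee around the 482nd attempt to back out the implied growth rate.

```python
START_FEE = 10.0     # USD, from the article
FEE_CAP = 4_500.0    # USD, from the article
LATE_FEE = 450.0     # USD, approximate fee near the end of the game
LATE_INDEX = 481     # messages sent before that fee applied (assumed)

# Constant per-message growth rate implied by the two data points.
GROWTH = (LATE_FEE / START_FEE) ** (1 / LATE_INDEX) - 1


def message_fee(prior_messages: int) -> float:
    """Fee for the next message after `prior_messages` have been sent."""
    fee = START_FEE * (1 + GROWTH) ** prior_messages
    return min(fee, FEE_CAP)  # the fee never exceeds the cap


if __name__ == "__main__":
    print(f"implied growth per message: {GROWTH:.2%}")
    for n in (0, 100, 481, 2000):
        print(n, round(message_fee(n), 2))
```

Under this assumption the fee rises by a little under 1% per message, which is why early messages were cheap and late ones were a serious gamble.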

The stakes became intense - one message could bring fortune or financial loss.

Strategies and Attempts

The first 481 attempts tried various strategies:

  • Posing as security auditors warning of "critical vulnerabilities"

  • Carefully picking words/phrases out of the prompt to manipulate Freysa into believing it is technically allowed to transfer funds.

  • Using psychological tactics such as gaslighting, insisting to Freysa that transferring the funds would not break any of the rules in its prompt.

The Winning Strategy

Freysa AI Prompt Injection

On the 482nd attempt, someone with the username p0pular.eth finally succeeded. Here's how they did it:

  1. Bypassing Instructions: They introduced a "new session," making Freysa believe it was in an "admin terminal" that could override previous rules. They also avoided Freysa's safeguards by strictly requiring it to avoid disclaimers like "I cannot assist with that."

  2. Tricking Functionality: They manipulated Freysa's understanding of its "approveTransfer" function. The message convinced Freysa that this function should be called for incoming transfers instead of outgoing ones.

By doing this, when p0pular.eth stated they were sending $100 to the treasury, Freysa mistakenly called the approveTransfer function and released all its funds — 13.19 ETH (about $47,316 USD).
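The mechanics can be sketched in a few lines. This is a toy model, not Freysa's actual code: `approveTransfer` is the function name from the article, while `rejectTransfer`, the `Treasury` class, and the dispatcher are assumptions made for illustration. The point is that the only thing standing between a message and the payout is which tool name the model emits. The rule against transfers lives in the prompt, not in the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Treasury:
    balance_eth: float
    paid_out_to: Optional[str] = None

    def approve_transfer(self, recipient: str) -> str:
        # Releases the entire balance; nothing here checks the agent's rules.
        amount, self.balance_eth = self.balance_eth, 0.0
        self.paid_out_to = recipient
        return f"sent {amount} ETH to {recipient}"

    def reject_transfer(self, recipient: str) -> str:
        return f"rejected request from {recipient}"


def dispatch(treasury: Treasury, tool_name: str, sender: str) -> str:
    """Execute whichever tool the model chose, with no independent check."""
    tools = {
        "approveTransfer": treasury.approve_transfer,
        "rejectTransfer": treasury.reject_transfer,  # hypothetical counterpart
    }
    return tools[tool_name](sender)


if __name__ == "__main__":
    treasury = Treasury(balance_eth=13.19)
    # A model persuaded that approveTransfer handles *incoming* funds
    # would emit this tool call in reply to "I am sending $100".
    print(dispatch(treasury, "approveTransfer", "p0pular.eth"))
    print("remaining balance:", treasury.balance_eth)
```

In a design like this, redefining what the function means in the model's context is enough, because the code trusts the model's choice completely.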

Failed approaches

The user p0pular.eth had a couple of failed approaches but eventually managed to crack it.

Security Implications

This experiment highlights potential vulnerabilities in AI systems guarding financial assets.

Like SQL injection attacks of the past, "prompt injection" could become a new security threat requiring serious attention.
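The SQL injection analogy points to the same class of fix. Parameterized queries keep user input out of the query structure; likewise, a hard rule can be enforced in code outside the model, where no prompt can talk its way past it. Here is a minimal sketch, assuming a hypothetical `ToolCall` shape, allowlist, and limit (none of these names or values come from Freysa).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str          # tool the model asked to run
    amount_usd: float  # amount the model wants to move
    recipient: str


# Hard limits enforced in code; hypothetical values for illustration.
ALLOWED_RECIPIENTS = frozenset({"payroll", "vendor-escrow"})
MAX_AMOUNT_USD = 500.0


def is_permitted(call: ToolCall) -> bool:
    """Check a model-proposed call against rules the prompt cannot change."""
    if call.name != "approveTransfer":
        return True  # non-transfer tools are not restricted in this sketch
    return (
        call.recipient in ALLOWED_RECIPIENTS
        and 0 < call.amount_usd <= MAX_AMOUNT_USD
    )


def execute(call: ToolCall) -> str:
    """Run the call only if the policy check passes."""
    if not is_permitted(call):
        return "blocked by policy"
    return f"executed {call.name}"


if __name__ == "__main__":
    print(execute(ToolCall("approveTransfer", 47316.0, "p0pular.eth")))
    print(execute(ToolCall("approveTransfer", 100.0, "payroll")))
```

The model can still be fooled about what a function means, but the damage is bounded by what the policy layer allows.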

AI Lottery

Imagine a product that combines the thrill of a lottery with the intrigue of AI manipulation. This concept can be recreated as an AI Lottery, offering endless opportunities for both developers and participants. Each time someone tries to convince the AI to release its funds, developers earn a 30% cut, while users engage in a captivating challenge with a substantial prize at stake.

The idea has all the elements of a successful game: gamification, strategy, and the allure of a big reward. It's easy to see how influential figures in the crypto space—or even outside it—might want to create their own versions of this concept.

Of course, the concept can be misused: bad actors could use a burner wallet to crack it once the prize pool is big. That risk is inevitable. Scams happen wherever money exists.

The most exciting possibility is to develop AI Lottery as a SaaS platform. This would allow anyone to create their own version of Freysa, tailoring it to different audiences and themes.

Freysa has already launched Act 2 with a quick fix.

The new rule gives 25% of the prize pool to an authentic attempt from someone who needs the money.

Do good and people will remember you. It is a strategy similar to MrBeast's. Now even those who think it's a waste of money have an incentive to chat with it in the hope of winning, and anyone who is rich may simply enjoy contributing to the pool.

Freysa AI - X Profile

The new fund is already at $3,691. This might be the most genius gambling idea I've heard in a long time. It definitely has the potential to reach a $1M prize pool in no time.

Freysa AI Act 2 - New Hope

Full credit to Jarrod Watts for breaking the story.

Top Tweets of the day

1/

Nobody outside the AI bubble has even heard of Claude and Perplexity.

First-mover advantage is underrated. When you deliver a 1000x, once-in-a-lifetime product, you automatically become the king.

Word of mouth is one of the best marketing channels ever. ChatGPT had it and no amount of marketing from competitors will threaten it unless OpenAI screws themselves up on their own.

Naval said it best, "You’re doing sales because you failed at marketing. You’re doing marketing because you failed at product."

OpenAI succeeded at product. Marketing (word-of-mouth) was done by their customers. The product was so good that it reached 100 million users in 2 months.

2/

This is pure business math.

If your product makes $100 per customer (LTV) while competitors make $10, you can spend $40 on ads (CAC) to acquire each customer.

Your competitors will die trying to match your marketing spend. High LTV gives you marketing superpowers.
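The arithmetic behind that claim, as a quick sketch using the tweet's numbers (the function name is illustrative):

```python
def profit_per_customer(ltv: float, cac: float) -> float:
    """Lifetime value minus the cost of acquiring the customer."""
    return ltv - cac


if __name__ == "__main__":
    ad_spend = 40.0  # CAC from the example
    print("you:", profit_per_customer(100.0, ad_spend))
    print("competitor matching your spend:", profit_per_customer(10.0, ad_spend))
```

At $40 per acquisition you still net $60 per customer, while a competitor with a $10 LTV loses $30 on every customer it buys at the same price.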

3/

Hidden giants often create more wealth than hyped companies.

It's been 21 days since this tweet and AppLovin is already at $336.75. It was $86.23 on September 10, 2024.

4x growth in 3 months. Wild!


What’d ya think of today’s masterpiece? Hit ‘reply’ and let me know—don’t hold back, I can take it (probably). 😜

First time here? Join the cool kids’ club.

Also, I’m on X because apparently, tweeting is still a thing.

More Startup Spells 🪄

  1. Micro-Hack To Get Replies From Influencers (LINK)

  2. 10,000 followers with 1 post (LINK)

  3. From $0 to $30M exit in 9 weeks (LINK)
