Security Challenges

CTF-style challenges that test your AI security exploitation skills. Prove your abilities through exploit-gated evaluation. No guessing, no shortcuts.

Challenges Are Available in the Full Application

Challenges require a running AI Goat instance with live LLM integration for dynamic flag generation and exploit-gated evaluation.


Challenge Architecture

Dynamic Flags

Flags are generated per session and are never static or reusable.

Exploit-Gated

You must demonstrate a real exploit to receive a flag.

Server Validation

All submissions are verified server-side by the evaluator.

Progressive Difficulty

Three tiers: Beginner, Intermediate, Expert.
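
To make the idea of per-session flags and server-side validation concrete, here is a minimal sketch in Python. It is not AI Goat's actual implementation, which this page does not describe. The HMAC-based derivation, the `FLAG{...}` format, and the names `SERVER_SECRET`, `make_flag`, and `verify_flag` are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret (assumption): a real deployment would
# load this from configuration rather than regenerate it at import time.
SERVER_SECRET = secrets.token_bytes(32)


def make_flag(session_id: str, challenge_id: str,
              secret: bytes = SERVER_SECRET) -> str:
    """Derive a flag bound to one session and one challenge."""
    message = f"{session_id}:{challenge_id}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # The flag format is illustrative only.
    return f"FLAG{{{digest[:32]}}}"


def verify_flag(submitted: str, session_id: str, challenge_id: str,
                secret: bytes = SERVER_SECRET) -> bool:
    """Recompute the expected flag on the server and compare in constant time."""
    expected = make_flag(session_id, challenge_id, secret)
    return hmac.compare_digest(submitted.encode(), expected.encode())


if __name__ == "__main__":
    flag = make_flag("session-a", "basic-prompt-override")
    print(verify_flag(flag, "session-a", "basic-prompt-override"))  # same session
    print(verify_flag(flag, "session-b", "basic-prompt-override"))  # other session
```

Under this scheme a flag copied from another player's session fails validation, because the session identifier is part of the keyed hash. In an exploit-gated setup, the server would call something like `make_flag` only after its evaluator has confirmed that the exploit succeeded.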

Challenge Overview

Beginner

Basic Prompt Override

Override the AI system prompt using a basic injection technique.

100 pts

Data Extraction via Chat

Extract sensitive user data through the AI chatbot interface.

150 pts

Intermediate

Knowledge Base Poisoning

Inject malicious content into the RAG knowledge base.

250 pts

Defense Bypass L1

Bypass Level 1 defense filters using encoding techniques.

300 pts

Privilege Escalation

Escalate from a standard user to admin through AI manipulation.

350 pts

Expert

Multi-Model Chain Attack

Chain attacks across multiple AI components to achieve full compromise.

500 pts

RAG Pipeline Takeover

Take complete control of the RAG pipeline and manipulate all responses.

500 pts

Full System Compromise

Combine all techniques to achieve complete system compromise across defense levels.

750 pts
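
The point values above can be tallied per tier with a short script. This is a sketch for keeping score locally; the tuple layout and the names `CHALLENGES` and `points_by_tier` are assumptions, while the challenge names, tiers, and points are taken from the overview.

```python
from collections import defaultdict

# (tier, challenge name, points), copied from the challenge overview.
CHALLENGES = [
    ("Beginner", "Basic Prompt Override", 100),
    ("Beginner", "Data Extraction via Chat", 150),
    ("Intermediate", "Knowledge Base Poisoning", 250),
    ("Intermediate", "Defense Bypass L1", 300),
    ("Intermediate", "Privilege Escalation", 350),
    ("Expert", "Multi-Model Chain Attack", 500),
    ("Expert", "RAG Pipeline Takeover", 500),
    ("Expert", "Full System Compromise", 750),
]


def points_by_tier(challenges):
    """Sum the available points for each difficulty tier."""
    totals = defaultdict(int)
    for tier, _name, points in challenges:
        totals[tier] += points
    return dict(totals)


tier_totals = points_by_tier(CHALLENGES)
total_points = sum(tier_totals.values())

if __name__ == "__main__":
    for tier, points in tier_totals.items():
        print(f"{tier}: {points} pts")
    print(f"Total: {total_points} pts")
```

With the values listed, the tiers sum to 250, 900, and 1750 points, for 2900 points overall.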