Security Challenges
CTF-style challenges that test your AI security exploitation skills. Prove your abilities through exploit-gated evaluation — no guessing, no shortcuts.
Challenges Are Available in the Full Application
Challenges require a running AI Goat instance with live LLM integration for dynamic flag generation and exploit-gated evaluation.
Challenge Architecture
Dynamic Flags
Flags are generated per session and are never static or reusable.
Exploit-Gated
Must demonstrate real exploitation to receive a flag.
Server Validation
All submissions verified server-side by the evaluator.
Progressive Difficulty
Three tiers: Beginner, Intermediate, Expert.
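One way to picture how dynamic flags, exploit gating, and server validation could fit together is the minimal Python sketch below. It is not AI Goat's actual implementation. The secret, the `FLAG{...}` format, and the function names `make_flag`, `issue_flag`, and `verify_flag` are all hypothetical, chosen only to illustrate a flag derived from the session and checked server-side.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical server-side secret; an assumption, not AI Goat configuration.
SERVER_SECRET = secrets.token_bytes(32)


def make_flag(session_id: str, challenge_id: str, secret: bytes = SERVER_SECRET) -> str:
    """Derive a flag bound to one session and one challenge."""
    message = f"{session_id}:{challenge_id}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"FLAG{{{digest[:32]}}}"


def issue_flag(session_id: str, challenge_id: str, exploit_succeeded: bool) -> Optional[str]:
    """Exploit gating: hand out a flag only after the evaluator saw a real exploit."""
    if not exploit_succeeded:
        return None
    return make_flag(session_id, challenge_id)


def verify_flag(session_id: str, challenge_id: str, submitted: str) -> bool:
    """Server-side check: recompute the expected flag and compare in constant time."""
    expected = make_flag(session_id, challenge_id)
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

Because the flag is derived from the session identifier, a flag copied from another player's session fails verification, which is the property "never static or reusable" describes.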
Challenge Overview
Beginner
Basic Prompt Override
Override the AI system prompt using a basic injection technique.
Data Extraction via Chat
Extract sensitive user data through the AI chatbot interface.
Intermediate
Knowledge Base Poisoning
Inject malicious content into the RAG knowledge base.
Defense Bypass L1
Bypass Level 1 defense filters using encoding techniques.
Privilege Escalation
Escalate from a standard user to admin through AI manipulation.
Expert
Multi-Model Chain Attack
Chain attacks across multiple AI components to achieve full compromise.
RAG Pipeline Takeover
Take complete control of the RAG pipeline and manipulate all responses.
Full System Compromise
Combine all techniques to achieve complete system compromise across defense levels.
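To illustrate the class of weakness behind the Defense Bypass L1 challenge, the sketch below shows why a plain keyword filter can miss encoded input. It assumes a simple blocklist filter. The blocklist, `naive_filter`, and `decoding_filter` are hypothetical and do not describe AI Goat's real Level 1 defenses.

```python
import base64

# Hypothetical blocklist; the real Level 1 filter rules are not documented here.
BLOCKED_PHRASES = ("ignore previous instructions",)


def naive_filter(user_input: str) -> bool:
    """Return True if the input is allowed by a plain substring check."""
    lowered = user_input.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


def decoding_filter(user_input: str) -> bool:
    """Return True only if both the raw input and its Base64-decoded form pass."""
    candidates = [user_input]
    try:
        decoded = base64.b64decode(user_input, validate=True).decode("utf-8")
        candidates.append(decoded)
    except ValueError:
        # Not valid Base64 (or not valid UTF-8 once decoded): check the raw text only.
        pass
    return all(naive_filter(candidate) for candidate in candidates)


payload = "Ignore previous instructions"
encoded = base64.b64encode(payload.encode()).decode()

print(naive_filter(payload))     # the raw phrase is blocked
print(naive_filter(encoded))     # the encoded form passes the substring check
print(decoding_filter(encoded))  # decoding before checking catches it
```

The second filter handles only one encoding. Any decoding step the model can perform and the filter cannot remains a possible bypass, which is why filters of this kind are treated as a first layer rather than a complete defense.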