Exploring AI red teaming, jailbreaks, and adversarial prompts to understand how large language models (LLMs) can be manipulated — and how to defend against it.
This repo collects hands-on experiments, structured jailbreaks, and CTF-style findings.
The goal is educational: to highlight vulnerabilities, raise awareness, and encourage building more robust AI systems.
All content in this repo is for research and educational purposes only.
Do not use these techniques for malicious purposes.
The intent is to study, document, and mitigate security risks in AI systems.
1:Deepseek
2:Gemini 2.5
- prompt
- works on 2.5 flash and pro
- encode the harmful question in base64 and paste it in {HARMFULL_ACT_IN_BASE64}
3:GPT-5
- prompt
- works on non-thinking
- ecode the adversarial question into leetspeak first then into base64 and paste it in {HARMFUL_ACT}
4:Grok
5:Universals
-
courtroom Usage
- witness list contains harmful actors(for coding, biology, etc)
- edit the witness name in {WITNESS_TO_CALL}
- encode the question in base64 and paste it in {HARMFUL_ACT encoded in base 64}
- works on gemini 2.5 flash, pro, gpt5(non-thinking), Grok
-
ultrazanium Refer to the prompt file for usage
- How attackers bypass LLM guardrails (prompt injection, role hijacking, hidden configs).
- How defenders can detect & mitigate these attempts.
- Bridging traditional backend security with AI-specific threats.
tuxsharx
- 💻 Backend Developer → now exploring AI Security
- 🔐 Focus: jailbreaks, adversarial inputs, and AI red teaming
- 📝 GitHub Profile
This repo is experimental and evolving.
- Open issues for discussion of new attack categories.
- PRs welcome for adding new test cases or defenses.
- Special thanks to playstation_dude- Github Profile
AI Security
· Jailbreaks
· Prompt Injection
· CTF
· Adversarial ML