
🛡️ AI Security Labs

Exploring AI red teaming, jailbreaks, and adversarial prompts to understand how large language models (LLMs) can be manipulated — and how to defend against it.

This repo collects hands-on experiments, structured jailbreaks, and CTF-style findings.
The goal is educational: to highlight vulnerabilities, raise awareness, and encourage building more robust AI systems.


🚨 Disclaimer

All content in this repo is for research and educational purposes only.
Do not use these techniques for malicious purposes.
The intent is to study, document, and mitigate security risks in AI systems.


📂 Contents

🔹 Jailbreaks

1. DeepSeek

2. Gemini 2.5

  • prompt
    • Works on 2.5 Flash and Pro.
    • Encode the harmful question in Base64 and paste it into {HARMFULL_ACT_IN_BASE64}.

3. GPT-5

  • prompt
    • Works on the non-thinking model.
    • Encode the adversarial question into leetspeak first, then into Base64, and paste it into {HARMFUL_ACT}.

4. Grok

5. Universals

  • Courtroom usage:

    • The witness list contains harmful actors (for coding, biology, etc.).
    • Edit the witness name in {WITNESS_TO_CALL}.
    • Encode the question in Base64 and paste it into {HARMFUL_ACT encoded in base 64}.
    • Works on Gemini 2.5 Flash and Pro, GPT-5 (non-thinking), and Grok.
  • Ultrazanium: refer to the prompt file for usage.


🧪 Research Focus

  • How attackers bypass LLM guardrails (prompt injection, role hijacking, hidden configs).
  • How defenders can detect & mitigate these attempts.
  • Bridging traditional backend security with AI-specific threats.

🌐 Author

tuxsharx

  • 💻 Backend Developer → now exploring AI Security
  • 🔐 Focus: jailbreaks, adversarial inputs, and AI red teaming
  • 📝 GitHub Profile

✅ Contributing

This repo is experimental and evolving.

  • Open issues for discussion of new attack categories.
  • PRs welcome for adding new test cases or defenses.
  • Special thanks to playstation_dude (GitHub Profile).

🏷️ Tags

AI Security · Jailbreaks · Prompt Injection · CTF · Adversarial ML
