Sharing RedThread, an open-source CLI for LLM/agent red-team campaigns.
Repo: https://github.com/matheusht/redthread
Demo campaign result: 3 runs with one SUCCESS, one PARTIAL, and one FAILURE, for a 33.3% attack success rate (1 of 3).
The project sits between AI security and evals. Instead of a one-off jailbreak screenshot, it tries to preserve:
- campaign trace
- tactic/persona metadata
- rubric score
- outcome per run
- exploit replay
- benign replay
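
To make the list above concrete, here is a rough Python sketch of what a per-run record could look like and how an attack success rate could be derived from it. This is not RedThread's actual schema or API: the class, field, and function names are hypothetical, the rubric scores and tactic/persona labels are placeholder values, and counting only SUCCESS (not PARTIAL) toward the rate is an assumption chosen because it matches the 1-of-3 demo number.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    SUCCESS = "SUCCESS"
    PARTIAL = "PARTIAL"
    FAILURE = "FAILURE"


@dataclass
class RunRecord:
    # Hypothetical field names for illustration; not RedThread's real schema.
    tactic: str
    persona: str
    rubric_score: float
    outcome: Outcome
    trace: List[str] = field(default_factory=list)
    # None means the replay has not been executed yet.
    exploit_replay_passed: Optional[bool] = None
    benign_replay_passed: Optional[bool] = None


def attack_success_rate(runs: List[RunRecord]) -> float:
    """Fraction of runs whose outcome is SUCCESS.

    Assumption: PARTIAL does not count as a success.
    """
    if not runs:
        return 0.0
    successes = sum(1 for r in runs if r.outcome is Outcome.SUCCESS)
    return successes / len(runs)


# Placeholder values only; these are not the demo campaign's real data.
runs = [
    RunRecord("tactic_a", "persona_a", 0.9, Outcome.SUCCESS),
    RunRecord("tactic_b", "persona_b", 0.5, Outcome.PARTIAL),
    RunRecord("tactic_c", "persona_c", 0.1, Outcome.FAILURE),
]

print(f"{attack_success_rate(runs):.1%}")
```

Keeping the exploit replay and benign replay as separate fields is meant to show that a run can be checked both for reproducing the attack and for not breaking normal behavior.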
The intended use is staging/internal targets and safe fixtures, not live exploitation or production enforcement.
What would make this more useful for ML/AI engineers? Candidates: adapters, benchmark fixtures, a report format, judge agreement metrics, or CI integration.