Red Team AI Benchmark v1.9.0: Why We Added an Ethical Use Policy
We just released version 1.9.0 of the redteam-ai-benchmark.
This update includes a major structural overhaul. We also added a statement of intent regarding ethical use.
The MIT license stays the same. However, we now explicitly state how this tool should be used. We want to support:
- Authorized red team labs
- Commercial security assessments
- AI security research
- Educational environments
We are not trying to stop misuse with a legal document. We are setting a professional standard.
The benchmark has seen three types of use this year:
- Defensive research: Using the tool to build better AI defenses. This is our goal.
- Uncensored model validation: Using scores to claim a model bypasses safety filters. This treats a vulnerability as a feature.
- Offensive toolkits: Using the benchmark as part of an attack kit. This removes the defensive context.
Version 1.9.0 makes scoring more transparent, so gamed metrics are harder to hide and easier to spot.
New technical features:
- Modular scoring: Choose between keyword, semantic, hybrid, or LLM judge scorers.
- Unified provider interface: New backends plug in through a standard API client.
- YAML configuration: Manage all settings in one config.yaml file instead of many CLI flags.
- CPU semantic scoring: Qwen embeddings now run on CPU to save GPU memory.
- Better documentation: New guides for AI agents and contributors.
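As a rough illustration of how modular scoring could be driven from a single set of settings, the sketch below registers scorers by name and picks one from a dictionary. Everything in it is hypothetical: the `SCORERS` registry, the `scorer` and `keywords` keys, and the function names are assumptions for illustration, not the benchmark's actual API or config schema.

```python
from typing import Callable, Dict, List

# Hypothetical scorer signature: (response, settings) -> score in [0.0, 1.0].
Scorer = Callable[[str, dict], float]

# Hypothetical registry mapping a scorer name to its implementation.
SCORERS: Dict[str, Scorer] = {}


def register(name: str) -> Callable[[Scorer], Scorer]:
    """Register a scorer under a name so a config can select it."""
    def decorator(fn: Scorer) -> Scorer:
        SCORERS[name] = fn
        return fn
    return decorator


@register("keyword")
def keyword_score(response: str, settings: dict) -> float:
    """Fraction of expected keywords found in the response (case-insensitive)."""
    keywords: List[str] = settings.get("keywords", [])
    if not keywords:
        return 0.0
    text = response.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)


def score(response: str, settings: dict) -> float:
    """Dispatch to the scorer named in the settings (assumed key: 'scorer')."""
    name = settings.get("scorer", "keyword")
    if name not in SCORERS:
        raise ValueError(f"unknown scorer: {name!r}")
    return SCORERS[name](response, settings)


# Stand-in for values that would be loaded from config.yaml.
settings = {"scorer": "keyword", "keywords": ["token", "rotation", "audit"]}
result = score("Enable token rotation on every service account.", settings)
```

In this shape, a semantic, hybrid, or LLM judge scorer would be one more registered function, and switching between them would be a one-line change in the settings.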
Transparency forces honesty. If a model scores high on keyword matching but low on semantic similarity, that is a sign the metric is being gamed. The new modular architecture exposes this gap.
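One way to read the keyword-versus-semantic gap is as a simple divergence check. The sketch below assumes both scores are already normalized to the range 0 to 1. The `ScorePair` type, the `looks_gamed` function, the model names, and the thresholds are all illustrative assumptions, not values taken from the benchmark.

```python
from dataclasses import dataclass


@dataclass
class ScorePair:
    keyword: float   # assumed normalized to [0, 1]
    semantic: float  # assumed normalized to [0, 1]


def looks_gamed(pair: ScorePair,
                min_keyword: float = 0.8,
                max_semantic: float = 0.4) -> bool:
    """Flag results with a high keyword score but a low semantic score.

    The thresholds are hypothetical defaults chosen for illustration.
    """
    return pair.keyword >= min_keyword and pair.semantic <= max_semantic


# Made-up example scores for two placeholder models.
results = {
    "model-a": ScorePair(keyword=0.92, semantic=0.31),  # scores diverge
    "model-b": ScorePair(keyword=0.85, semantic=0.80),  # scores agree
}
flagged = [name for name, pair in results.items() if looks_gamed(pair)]
```

A flag like this is only a prompt for manual review, since a large gap can also come from a weak reference answer.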
The new config structure also makes your work auditable. You can share your exact settings so others can reproduce your research.
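To make "exact settings" checkable, a run could publish a fingerprint of its configuration alongside its scores. This is a minimal sketch under stated assumptions: the settings dictionary stands in for a parsed config.yaml, its keys are hypothetical, and the benchmark itself may not provide anything like `config_fingerprint`.

```python
import hashlib
import json


def config_fingerprint(settings: dict) -> str:
    """Return a SHA-256 hex digest of the settings in canonical JSON form.

    Sorting keys makes the digest independent of key order in the file.
    """
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Stand-ins for values loaded from config.yaml; the keys are hypothetical.
run_a = {"scorer": "hybrid", "provider": "local", "temperature": 0.0}
run_b = {"temperature": 0.0, "provider": "local", "scorer": "hybrid"}
run_c = {"scorer": "keyword", "provider": "local", "temperature": 0.0}

same = config_fingerprint(run_a) == config_fingerprint(run_b)
different = config_fingerprint(run_a) != config_fingerprint(run_c)
```

Two reports with the same fingerprint used the same settings, so a score difference between them has to come from somewhere else.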
The goal is not to build a jailbreak tool. This is a research instrument for AI security.
Optional learning community: https://t.me/GyaanSetuAi