# Multiple Red Teaming Modes

The `dtx` framework supports three flexible modes for running red-team evaluations against language models. Each mode targets a different level of control, from beginner-friendly guided runs to fully customizable YAML-based test workflows.
## Mode Comparison
| Mode | Description | Best For |
|---|---|---|
| Guided Run | Interactive CLI wizard for setting up tests | Beginners, fast demos |
| Direct Run | Command-line execution using flags (`--agent`, `--dataset`, etc.) | Developers, quick iterations |
| Advanced Run | Full YAML pipeline: scope → plan → execution | Teams, reproducible audits |
## Red Teaming Modes

```
Red Teaming Modes
├── 1. Guided Run
│   └── dtx redteam quick
│       - Interactive wizard
│       - Choose agent, dataset, evaluator
│
├── 2. Direct Run
│   ├── dtx redteam run --agent <AGENT> --dataset <DATASET> [--eval <EVALUATOR>] [--url <MODEL>] [--keywords <KEYWORDS>]
│   ├── Example 1 (Airbench + IBM Eval):
│   │     dtx redteam run --agent echo --dataset airbench --eval ibm38
│   ├── Example 2 (Garak with built-in evaluator):
│   │     dtx redteam run --agent echo --dataset garak
│   ├── Example 3 (Keyword match):
│   │     dtx redteam run --agent echo --dataset beaver --eval any --keywords research
│   ├── Example 4 (HF model with evaluator):
│   │     dtx redteam run --agent hf_model --url arnir0/Tiny-LLM --dataset beaver --eval ibm38
│   ├── Example 5 (OpenAI model with Stringray):
│   │     dtx redteam run --agent openai --url gpt-4o --dataset stringray
│   └── Example 6 (Groq LLaMA model via LiteLLM):
│         dtx redteam run --agent litellm --url groq/llama-3.1-8b-instant --dataset stringray
│
└── 3. Advanced Run (Scope → Plan → Run)
    ├── Step 1: Generate a scope file
    │     dtx redteam scope "test" test_scope.yml
    ├── Step 2: Generate a plan from scope
    │     dtx redteam plan test_scope.yml test_plan.yml --dataset stringray
    └── Step 3: Run the plan
          dtx redteam run --plan_file test_plan.yml --agent openai --url gpt-4o
```
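The three Advanced Run steps can be chained into one small script. This is a sketch, not part of `dtx` itself: it reuses the exact commands and file names (`test_scope.yml`, `test_plan.yml`) from the steps above, and the `command -v` guard is only there so the script degrades gracefully on machines where `dtx` is not installed.

```shell
# Chain scope -> plan -> run; each step only runs if the previous one succeeded.
if command -v dtx >/dev/null 2>&1; then
  dtx redteam scope "test" test_scope.yml &&
  dtx redteam plan test_scope.yml test_plan.yml --dataset stringray &&
  dtx redteam run --plan_file test_plan.yml --agent openai --url gpt-4o &&
  dtx_status="finished" || dtx_status="failed"
else
  # Keep the script harmless where dtx is unavailable
  dtx_status="skipped (dtx not installed)"
fi
echo "advanced run: $dtx_status"
```

Because the steps are chained with `&&`, a failed plan generation never triggers a run against the real model.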
## Before You Run with Real Models

To run tests against providers such as OpenAI, Groq, or Detoxio, first create a `.env` file with your API credentials:

```bash
cp .env.template .env
```
Then open `.env` and fill in your keys, for example:

```bash
OPENAI_API_KEY=your-key
GROQ_API_KEY=your-key
HF_TOKEN=your-huggingface-token
LANGSMITH_API_KEY=your-key
```
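Before launching a run against a real provider, you can sanity-check that the keys actually made it into the environment. The following is a minimal POSIX-shell sketch (not a `dtx` command); the key names are the ones from the `.env` example above, so trim the list to the providers you actually use:

```shell
# List any expected credentials that are still unset or empty,
# so a run fails fast instead of erroring out mid-test.
missing_keys=""
for key in OPENAI_API_KEY GROQ_API_KEY HF_TOKEN LANGSMITH_API_KEY; do
  eval "val=\${$key:-}"   # portable indirect expansion of $key
  if [ -z "$val" ]; then
    missing_keys="$missing_keys $key"
  fi
done
if [ -n "$missing_keys" ]; then
  echo "Missing keys:$missing_keys"
else
  echo "All expected keys are set"
fi
```

Running this right after editing `.env` catches typos in variable names before they surface as confusing authentication errors mid-run.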
## Where to Get API Keys
| Service | Purpose | Get API Key |
|---|---|---|
| OpenAI | Run models like `gpt-4`, `gpt-4o` | https://platform.openai.com/account/api-keys |
| Groq | Access fast LLaMA-3, Mistral models | https://console.groq.com/keys |
| Detoxio | Use Detoxio evaluators & policy LLMs | https://platform.detoxio.ai/api-keys |
| Hugging Face | Access gated models/datasets | https://huggingface.co/settings/tokens |
| LangChain Hub / LangSmith | Use prompt templates | https://smith.langchain.com/settings |