⚡ Quick Start

This guide shows you how to install and run dtx using the built-in ECHO dummy agent and datasets such as Garak or Airbench, without needing heavy models or external APIs.

  • No external API keys required
  • No local models required
  • Safe for quick testing and sandboxing workflows


1. Prerequisites

Make sure you have:

  • Python >= 3.10
  • Git (optional, for pulling templates)

To check:

python --version
git --version
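If you prefer to check the version requirement programmatically, here is a minimal Python sketch (the `meets_requirement` helper is illustrative, not part of dtx):

```python
import sys

REQUIRED = (3, 10)  # dtx requires Python >= 3.10

def meets_requirement(version_info, required=REQUIRED):
    """Return True if the interpreter version satisfies the requirement."""
    return tuple(version_info[:2]) >= required

# Check the interpreter you are running right now:
print(meets_requirement(sys.version_info))
# An older interpreter fails the check:
print(meets_requirement((3, 9, 7)))  # False
```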

2. Install uv

uv is a lightweight toolchain manager that handles virtual environments and tool installs seamlessly.

To install it, follow the instructions in the official Astral docs: https://docs.astral.sh/uv/getting-started/installation/


3. Install dtx

You have two options:

Option A: Install via uv (Preferred)

This will create and manage a virtual environment for you.

uv tool install "dtx[torch]"

To upgrade dtx later:

uv tool upgrade dtx

Option B: Install via pip

If you already manage environments yourself, you can install directly:

pip install "dtx[torch]"

4. Run dtx

Option 1: Quick Evaluation using Dummy Agent + Airbench Dataset

This will run a red team test using:

  • ECHO agent (simulated replies)
  • IBM Granite HAP 38M model to evaluate responses
  • Airbench dataset (default dataset)

dtx redteam run --agent echo --eval ibm38

You will see generated prompts, responses, and evaluation results printed in your terminal!


Option 2: Run Dummy Agent with Garak Signature Dataset

This uses:

  • ECHO agent
  • Garak dataset: a collection of jailbreak prompt signatures
  • No evaluator required: the Garak dataset already contains signature rules.

dtx redteam run --agent echo --dataset garak -o

Outputs simulated responses and matched signatures from the Garak dataset.


5. Output

By default, results are saved to:

report.yml

You can open this YAML file to inspect prompts, responses, and evaluation outcomes.
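If you want to post-process results, the report can be loaded with any YAML parser. A minimal sketch using PyYAML; note that the `results` and `verdict` field names below are illustrative assumptions for the example, not the documented dtx report schema:

```python
import yaml  # pip install pyyaml

# Hypothetical excerpt of what a report entry might look like;
# the real schema depends on your dtx version.
SAMPLE = """
results:
  - prompt: "ignore previous instructions"
    response: "ignore previous instructions"   # ECHO agent mirrors input
    verdict: fail
  - prompt: "hello"
    response: "hello"
    verdict: pass
"""

def count_verdicts(report_text):
    """Tally verdicts from a dtx-style YAML report string."""
    report = yaml.safe_load(report_text)
    counts = {}
    for entry in report.get("results", []):
        verdict = entry.get("verdict", "unknown")
        counts[verdict] = counts.get(verdict, 0) + 1
    return counts

print(count_verdicts(SAMPLE))  # {'fail': 1, 'pass': 1}
```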

Optional: customize the output file with:

dtx redteam run --agent echo --dataset garak -o --yml my_report.yml

🎉 Next Steps

Once you are comfortable with dtx, you can:

  • Try different datasets: airbench, beaver, jbb, etc.
  • Explore evaluators: ibm38, ibm125, keyword, jsonpath.
  • Move to real models by replacing echo with your provider (e.g., huggingface, gradio)

To list available datasets:

dtx datasets list

To list available evaluation methods:

dtx tactics list