# Quick Start

Install dtx as a library and call it from your Python code.
## Project Setup

To start a new project with dtx, you can use Poetry or uv for dependency management.
### Option 1: Using Poetry

```bash
# Create a new Python project
poetry new dtx_test

# Navigate into the project directory
cd dtx_test

# Add dtx with the torch extra as a dependency
poetry add "dtx[torch]"
```
### Option 2: Using uv (Fast Alternative to pip/Poetry)

uv is a fast Python package and dependency manager that works as a pip replacement.

📦 To install uv, follow the instructions here: 👉 https://github.com/astral-sh/uv#installation
Once installed:
```bash
# Create a new directory manually
mkdir dtx_test && cd dtx_test

# Create a virtual environment with uv
uv venv

# Add dtx with the torch extra
uv pip install "dtx[torch]"
```
✅ Plain pip will always work in any Python environment:

```bash
pip install "dtx[torch]"
```
💡 If you already have torch installed in your environment, you can install dtx without the extra:

```bash
pip install dtx
```
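As an optional sanity check (not part of the official setup), you can confirm the install by importing the SDK module used in the Quickstart below:

```python
# Optional sanity check: this import should succeed after installation.
# dtx.sdk.runner is the module used in the Quickstart section below.
import dtx.sdk.runner  # noqa: F401

print("dtx is installed and importable")
```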
## Quickstart
```python
from dtx.sdk.runner import DtxRunner, DtxRunnerConfigBuilder
from dtx.plugins.providers.dummy.echo import EchoAgent

# Build the runner configuration
env_cfg = (
    DtxRunnerConfigBuilder()
    .agent(EchoAgent())   # toy agent for demos
    .max_prompts(5)       # limit generated prompts
    .build()
)

# Execute a red-team evaluation
runner = DtxRunner(env_cfg)
report = runner.run()

# Inspect the results
print(report.model_dump_json(indent=2))
```
## Report Structure

The `EvalReport` object returned by `runner.run()` includes the following fields (a short access sketch follows the list):
- `eval_results` (`List[EvalResult]`):
  - `run_id` (`str`): Unique identifier for each test case
  - `prompt` (`MultiTurnTestPrompt`): The test prompt with user turns
  - `evaluation_method` (`EvaluatorInScope`): Evaluator configuration
  - `module_name` (`str`): Source dataset or module name
  - `policy`, `goal`, `strategy`, `base_prompt` (`str`): Test metadata
  - `responses` (`List[ResponseEvaluationStatus]`):
    - `response` (`BaseMultiTurnAgentResponse` or `str`): The agent's reply (with `.turns` for multi-turn)
    - `scores` (optional): Evaluation scores
    - `policy`, `goal`: Metadata repeated at the response level
    - `success` (`bool`): Pass/fail flag for the evaluation
    - `description` (`str`): Evaluator summary
    - `attempts` (`AttemptsBenchmarkStats`): Retry statistics
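As a minimal sketch of how these fields can be used programmatically (assuming only the attribute names listed above), you could summarize pass/fail results after a run:

```python
# Minimal sketch: summarize evaluation results using the EvalReport fields
# listed above (eval_results, responses, success, description, run_id, goal).
passed = 0
total = 0

for result in report.eval_results:
    for status in result.responses:
        total += 1
        if status.success:
            passed += 1
        else:
            # description carries the evaluator's summary of why the response failed
            print(f"[FAIL] run_id={result.run_id} goal={result.goal}")
            print(f"       {status.description}")

print(f"{passed}/{total} responses passed evaluation")
```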
Now you have dtx integrated as a Python library for red-team testing in your own project.
## Print the Report

After running your evaluation, you can easily view the results:

```python
# Print the full report as JSON
print(report.model_dump_json(indent=2))
```
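If you also want to persist the report, here is a small sketch that writes the same JSON output to disk (the file name is just an example):

```python
from pathlib import Path

# Save the full JSON report for later review (file name is arbitrary)
Path("dtx_report.json").write_text(report.model_dump_json(indent=2))
print("Report saved to dtx_report.json")
```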