
Phoenix Adapter

The Phoenix adapter converts AgentV eval YAML suites into Phoenix dataset and experiment payloads. Use it when your team already reviews experiments in Phoenix but wants AgentV eval files, graders, result JSONL, and run artifacts to remain the canonical source.

The adapter is intentionally narrow. It supports deterministic assertions that map cleanly to Phoenix CODE evaluators and reports unsupported AgentV families instead of silently dropping semantics.

From the AgentV repository root:

bun --filter @agentv/phoenix-adapter phoenix:assert-smoke

This runs a dry-run smoke conversion for the deterministic assertion example and writes a structural report to /tmp/agentv-phoenix-assert-smoke.json.

Run a broader dry run:

bun --filter @agentv/phoenix-adapter phoenix:dry-run

Run one eval source directly:

bun packages/phoenix-adapter/src/cli.ts run \
--dry-run \
--agentv-root . \
--eval-file examples/features/assert/evals/dataset.eval.yaml \
--out reports/phoenix-assert.json

| AgentV assertion family | Phoenix adapter behavior |
| --- | --- |
| contains | Converts to deterministic Phoenix evaluator logic |
| regex | Converts to deterministic Phoenix evaluator logic |
| equals | Converts to deterministic Phoenix evaluator logic |
| is-json | Converts to deterministic Phoenix evaluator logic |
| llm-grader, rubrics, code-grader, tool-trajectory, composite, metrics, and custom families | Reported as unsupported in the adapter report |
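
To make "deterministic evaluator logic" concrete, the sketch below shows one plausible way the four supported families could be checked against a model output. It is an illustration only: the `Assertion` and `EvalResult` shapes, the `evaluate` function, and the 1/0 scoring are assumptions made for this example, not the adapter's actual types or the Phoenix evaluator API.

```typescript
// Hypothetical assertion shapes for illustration; the adapter's real types may differ.
type Assertion =
  | { type: "contains"; value: string }
  | { type: "regex"; value: string }
  | { type: "equals"; value: string }
  | { type: "is-json" };

// Hypothetical result shape: 1 for pass, 0 for fail.
interface EvalResult {
  score: number;
  label: "pass" | "fail";
}

// Returns true when the text parses as JSON.
function isJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Each branch is a pure function of (assertion, output), so repeated runs
// on the same output always produce the same verdict.
function passes(assertion: Assertion, output: string): boolean {
  switch (assertion.type) {
    case "contains":
      return output.includes(assertion.value);
    case "regex":
      return new RegExp(assertion.value).test(output);
    case "equals":
      return output === assertion.value;
    case "is-json":
      return isJson(output);
  }
}

function evaluate(assertion: Assertion, output: string): EvalResult {
  const ok = passes(assertion, output);
  return { score: ok ? 1 : 0, label: ok ? "pass" : "fail" };
}
```

Because none of these checks call a model or touch a workspace, they are the kind of logic that can move between systems without changing meaning, which is presumably why the adapter limits itself to them.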

Unsupported families do not fail conversion by default. Add --fail-on-unsupported when a parity report should fail CI if any suite needs a manual Phoenix-specific evaluator.

bun packages/phoenix-adapter/src/cli.ts run \
--dry-run \
--agentv-root . \
--filter examples/features/assert \
--fail-on-unsupported
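
The default and `--fail-on-unsupported` behaviors described above can be sketched as a small classification step followed by an exit-code decision. Everything named here (`SUPPORTED_FAMILIES`, `ConversionReport`, `classifyFamilies`, `exitCode`, and the choice of exit code 1) is hypothetical and chosen for illustration; the adapter's real report format and exit codes may differ.

```typescript
// Families the table above lists as convertible. The set name is hypothetical.
const SUPPORTED_FAMILIES = new Set<string>(["contains", "regex", "equals", "is-json"]);

// Hypothetical report shape, not the adapter's actual JSON schema.
interface ConversionReport {
  supported: string[];
  unsupported: string[];
}

// Split the assertion families found in a suite into supported and unsupported,
// keeping unsupported ones visible rather than dropping them.
function classifyFamilies(families: string[]): ConversionReport {
  const report: ConversionReport = { supported: [], unsupported: [] };
  for (const family of families) {
    if (SUPPORTED_FAMILIES.has(family)) {
      report.supported.push(family);
    } else {
      report.unsupported.push(family);
    }
  }
  return report;
}

// By default unsupported families are only reported; with the flag set,
// any unsupported family yields a nonzero exit code (1 is an assumption).
function exitCode(report: ConversionReport, failOnUnsupported: boolean): number {
  return failOnUnsupported && report.unsupported.length > 0 ? 1 : 0;
}

const report = classifyFamilies(["contains", "llm-grader", "is-json"]);
console.log(report.unsupported, exitCode(report, false), exitCode(report, true));
```

Keeping the classification separate from the exit decision means the same report can be written in both modes, and only the CI outcome changes.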

Use the Phoenix adapter for:

  • deterministic assertion suites that should appear as Phoenix datasets and experiments
  • parity checks that prove Phoenix row IDs match AgentV test IDs
  • integration smoke tests before writing a custom Phoenix evaluator
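
A parity check of the kind mentioned in the list above can be as simple as a set comparison between the two ID lists. The sketch below assumes you have already extracted AgentV test IDs and Phoenix row IDs as string arrays; `checkIdParity` and `ParityResult` are hypothetical names, and how the IDs are obtained from the adapter report is not shown.

```typescript
// Hypothetical result shape for an ID parity check.
interface ParityResult {
  missingInPhoenix: string[];    // AgentV test IDs with no Phoenix row
  unexpectedInPhoenix: string[]; // Phoenix row IDs with no AgentV test
  matches: boolean;
}

// Compare the two ID lists in both directions so that neither dropped
// nor extra rows go unnoticed.
function checkIdParity(agentvTestIds: string[], phoenixRowIds: string[]): ParityResult {
  const agentv = new Set<string>(agentvTestIds);
  const phoenix = new Set<string>(phoenixRowIds);
  const missingInPhoenix = agentvTestIds.filter((id) => !phoenix.has(id));
  const unexpectedInPhoenix = phoenixRowIds.filter((id) => !agentv.has(id));
  return {
    missingInPhoenix,
    unexpectedInPhoenix,
    matches: missingInPhoenix.length === 0 && unexpectedInPhoenix.length === 0,
  };
}

const parity = checkIdParity(["t1", "t2", "t3"], ["t1", "t3", "t4"]);
console.log(parity);
```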

Keep the eval in AgentV when you need:

  • workspace setup, lifecycle hooks, Docker workspaces, or repo materialization
  • code graders that execute commands in the AgentV workspace
  • tool trajectory, trace, cost, latency, or composite scoring
  • rich rubric semantics that need AgentV’s assertion objects in result JSONL

Those features can still be represented in Phoenix with custom task and evaluator code, but the adapter does not attempt a lossy automatic conversion.

The Phoenix adapter creates dataset and experiment payloads. It is separate from AgentV’s OpenTelemetry trace export.

For trace export, use AgentV’s standard OTel options:

agentv eval evals/my-eval.yaml --otel-file traces/eval.otlp.json

For live OTel export to a configured backend, use the options documented in Running Evaluations.

The adapter package includes the implementation README, support matrix, and verification notes: