OTel Compatibility
agentevals uses OpenTelemetry (OTel) traces as the source material for evaluation.
Why compatibility matters
Evaluation quality depends on whether your traces contain the information needed to reconstruct meaningful agent behavior.
That information often includes:
- clear span structure
- useful attributes and metadata
- model, tool, and response information
- enough context to connect a trace to an eval item or run
What to check
Before relying on evaluation output, verify that your telemetry captures:
- the user request or task context
- intermediate model or tool actions when relevant
- final outputs
- identifiers needed to group or compare runs
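The checklist above can be sketched as a small pre-flight check. This is a minimal illustration, not part of agentevals: it assumes spans have already been exported as plain dictionaries with an `attributes` mapping, and the keys `app.user_request`, `app.final_output`, and `app.run_id` are hypothetical placeholders for whatever your instrumentation actually emits.

```python
from typing import Any

# Hypothetical attribute keys; replace them with the keys your instrumentation emits.
REQUIRED_KEYS = {
    "request": "app.user_request",
    "output": "app.final_output",
    "run_id": "app.run_id",
}


def missing_fields(spans: list[dict[str, Any]]) -> list[str]:
    """Return the required fields that no span in the trace carries."""
    present = set()
    for span in spans:
        attributes = span.get("attributes", {})
        for name, key in REQUIRED_KEYS.items():
            # Treat absent and empty values the same way: not usable for scoring.
            if attributes.get(key) not in (None, ""):
                present.add(name)
    return [name for name in REQUIRED_KEYS if name not in present]


# Example trace: the request and run ID are recorded, the final output is not.
trace = [
    {"name": "agent.run", "attributes": {"app.user_request": "Book a flight", "app.run_id": "run-1"}},
    {"name": "tool.search", "attributes": {"tool.name": "flight_search"}},
]
print(missing_fields(trace))  # ['output']
```

Running a check like this on a handful of sample traces before scoring makes gaps visible early, instead of surfacing later as unexplained evaluation results.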
If your traces look different
Different frameworks follow different OTel conventions, so span names and attribute keys can vary. When yours differ:
- normalize where possible
- keep important identifiers stable
- verify that the extracted data matches the behavior you want to score
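One way to approach normalization is to map framework-specific attribute keys onto a single set of names while passing identifiers through unchanged. The sketch below is an assumption-laden illustration, not agentevals behavior: the alias table, the canonical names `model` and `tool`, and the identifier keys `run_id` and `eval_item_id` are hypothetical, and the source keys are only examples that should be checked against what your framework emits.

```python
from typing import Any

# Hypothetical alias table: framework-specific key -> canonical name.
# Verify the left-hand keys against your own traces.
KEY_ALIASES = {
    "llm.model_name": "model",
    "gen_ai.request.model": "model",
    "tool.name": "tool",
    "gen_ai.tool.name": "tool",
}

# Hypothetical identifier keys, copied through unchanged so runs stay comparable.
STABLE_KEYS = ("run_id", "eval_item_id")


def normalize_span(span: dict[str, Any]) -> dict[str, Any]:
    """Return a flat record with canonical keys; unknown attributes are dropped."""
    attributes = span.get("attributes", {})
    normalized: dict[str, Any] = {"name": span.get("name", "")}
    for key, value in attributes.items():
        if key in KEY_ALIASES:
            normalized[KEY_ALIASES[key]] = value
        elif key in STABLE_KEYS:
            normalized[key] = value
    return normalized


# Two spans that describe the same call under different conventions.
a = normalize_span({"name": "chat", "attributes": {"llm.model_name": "m1", "run_id": "r1"}})
b = normalize_span({"name": "chat", "attributes": {"gen_ai.request.model": "m1", "run_id": "r1"}})
print(a == b)  # True
```

Comparing normalized records from two sources, as in the last lines, is a quick way to confirm that the extracted data describes the same behavior before scoring it.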