OTel Compatibility

agentevals uses OpenTelemetry (OTel) traces as the source material for evaluation.

Why compatibility matters

Evaluation quality depends on whether your traces contain the information needed to reconstruct meaningful agent behavior.

That often includes:

  • clear span structure
  • useful attributes and metadata
  • model, tool, and response information
  • enough context to connect a trace to an eval item or run
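As a rough illustration, a single LLM-call span covering these points might look like the sketch below, shown as plain dicts rather than live OTel objects. The `gen_ai.*` keys follow OTel's GenAI semantic conventions; the `eval.*` keys are hypothetical names for linking spans to an eval run or item, not a fixed agentevals schema.

```python
# One LLM-call span, sketched as a plain dict. gen_ai.* keys follow OTel's
# GenAI semantic conventions; eval.* keys are hypothetical identifiers.
llm_span = {
    "name": "chat gpt-4o",
    "attributes": {
        "gen_ai.system": "openai",         # which provider handled the call
        "gen_ai.request.model": "gpt-4o",  # model that was requested
        "gen_ai.usage.input_tokens": 812,
        "gen_ai.usage.output_tokens": 96,
        "eval.run_id": "run-001",          # hypothetical: ties span to a run
        "eval.item_id": "item-17",         # hypothetical: ties span to an eval item
    },
}

# Clear span structure means tool calls appear as child spans rather than
# being buried inside one opaque blob:
tool_span = {
    "name": "tool search_web",
    "parent": "chat gpt-4o",
    "attributes": {"tool.name": "search_web", "eval.run_id": "run-001"},
}
```

Keeping the run identifier on every span, as above, is what later lets an evaluator reassemble the full trace for one task.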

What to check

Before relying on evaluation output, verify that your telemetry captures:

  • the user request or task context
  • intermediate model or tool actions when relevant
  • final outputs
  • identifiers needed to group or compare runs
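This check can be scripted. The sketch below assumes spans exported as dicts with an `attributes` mapping; the specific key names in `required` are placeholders for whatever your instrumentation actually emits, not an agentevals requirement.

```python
def find_missing(spans, required_keys):
    """Return the required attribute keys that appear in no span of a trace.

    `spans` is a list of exported span dicts, each with an "attributes"
    mapping. Key names are placeholders, not a fixed schema.
    """
    seen = set()
    for span in spans:
        seen.update(span.get("attributes", {}).keys())
    return [key for key in required_keys if key not in seen]

# Hypothetical required keys covering the checklist above:
required = [
    "input.value",           # the user request or task context
    "output.value",          # final output
    "gen_ai.request.model",  # intermediate model action detail
    "eval.run_id",           # identifier for grouping/comparing runs
]

trace = [
    {"name": "agent", "attributes": {"input.value": "book a flight",
                                     "eval.run_id": "r1"}},
    {"name": "llm", "attributes": {"gen_ai.request.model": "gpt-4o"}},
]
missing = find_missing(trace, required)  # → ["output.value"]
```

Running this once per trace before evaluation surfaces instrumentation gaps early, instead of letting them show up as silently degraded scores.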

If your traces look different

Different frameworks emit spans under different OTel semantic conventions and attribute names. In those cases:

  • normalize where possible
  • keep important identifiers stable
  • verify that the extracted data matches the behavior you want to score
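Normalization can be as simple as rewriting each framework's attribute keys onto one canonical vocabulary while passing identifiers through untouched. All key names in this sketch are illustrative, not an exhaustive mapping for any real framework.

```python
# Map framework-specific attribute keys to one canonical vocabulary.
# Every key name here is illustrative; real frameworks may differ.
KEY_MAP = {
    "llm.model_name": "gen_ai.request.model",  # one framework's spelling
    "ai.model.id": "gen_ai.request.model",     # another framework's spelling
    "llm.prompt": "input.value",
    "llm.completion": "output.value",
}

def normalize(attributes):
    """Rewrite known keys to canonical names; pass everything else through,
    so identifiers like run or session IDs stay byte-for-byte stable."""
    return {KEY_MAP.get(key, key): value for key, value in attributes.items()}

raw = {"llm.model_name": "claude-3-5-sonnet", "session.id": "s-42"}
clean = normalize(raw)
# clean == {"gen_ai.request.model": "claude-3-5-sonnet", "session.id": "s-42"}
```

After normalizing, it is worth spot-checking a few traces end to end to confirm the extracted data still matches the behavior you intend to score.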