Advanced

Docs

| Guide | Description |
| --- | --- |
| Eval Set Format | Schema, field reference, and examples for golden eval set JSON files |
| Custom Evaluators | Write your own scoring logic in Python, JavaScript, or any language |
| Live Streaming | Real-time trace streaming, dev server setup, and session management |
| OpenTelemetry Compatibility | Supported OTel conventions, message delivery mechanisms, and OTLP receiver |

REST API Reference

While the server is running (agentevals serve), interactive API documentation is available at:

| Endpoint | Description |
| --- | --- |
| /docs | Swagger UI with interactive request builder |
| /redoc | ReDoc reference documentation |
| /openapi.json | Raw OpenAPI 3.x schema (for code generation or CI) |

The OTLP receiver (port 4318) serves its own docs at http://localhost:4318/docs.
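
For CI or code generation, the raw schema can be inspected with a few lines of Python. The sketch below is illustrative only: `list_operations` and `fetch_schema` are hypothetical helper names, not part of agentevals, and the base URL you pass must match wherever your server is actually listening. It relies only on the standard OpenAPI 3.x layout, where operations live under the top-level `paths` object.

```python
import json
from urllib.request import urlopen

# HTTP methods that OpenAPI 3.x allows as operation keys inside a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI 3.x schema, sorted by path."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def fetch_schema(base_url: str) -> dict:
    """Download <base_url>/openapi.json. Requires a running server."""
    with urlopen(base_url.rstrip("/") + "/openapi.json", timeout=5) as response:
        return json.load(response)


# Example (hypothetical base URL; substitute your server's address):
#   for method, path in list_operations(fetch_schema("http://localhost:8000")):
#       print(method, path)
```

Keeping the parsing step separate from the download means a CI job can run `list_operations` against a saved copy of the schema without starting the server.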

Development

uv run pytest                      # run tests
uv run agentevals serve --dev      # backend
cd ui && npm run dev               # frontend (separate terminal)

See DEVELOPMENT.md for build tiers, Makefile targets, and Nix setup. To contribute, see CONTRIBUTING.md.