# Reporters

Output evaluation results to the console, JSON files, CSV files, or a CI-friendly format with automatic failure exit codes.
## Overview

Reporters format and output eval suite results. Call `result.report()` with one or more reporter types after a run completes.
```ts
const result = await suite.run();

result.report('console');
result.report('json', { path: './reports/eval.json' });
result.report(['console', 'json', 'csv'], { path: './reports/eval' });
```

You can also use the standalone `report()` function:
```ts
import { report } from '@cogitator-ai/evals';

report(resultData, 'console');
report(resultData, ['json', 'csv'], { path: './eval-report' });
```

## Console Reporter
Prints a formatted table of aggregated metrics, assertion results, and a summary line to stdout.

```ts
result.report('console');
```

Example output:
```
Metric          Mean    Median  P95     Min     Max
──────────────────────────────────────────────────────────
exactMatch      0.9200  1.0000  1.0000  0.0000  1.0000
contains        0.9600  1.0000  1.0000  0.0000  1.0000

Assertions
✓ threshold(exactMatch)  exactMatch = 0.92 >= 0.9
✗ threshold(contains)    contains = 0.96, expected >= 0.98

Summary: 50 cases | 12340ms | $0.52 | 1 passed 1 failed
```

Assertions show green check marks for passes and red crosses for failures.
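The Mean, Median, and P95 columns can be reproduced from the raw per-case scores. A minimal sketch of that math (illustrative only, not the library's internal code; the nearest-rank definition of P95 is an assumption):

```typescript
// Hypothetical aggregation helper; the library may compute these differently.
function aggregate(scores: number[]) {
  const sorted = [...scores].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, s) => sum + s, 0) / n;
  // Median: middle value, or the average of the two middle values.
  const median =
    n % 2 === 1 ? sorted[(n - 1) / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
  // Nearest-rank P95: smallest value with at least 95% of scores at or below it.
  const p95 = sorted[Math.min(n - 1, Math.ceil(0.95 * n) - 1)];
  return { mean, median, p95, min: sorted[0], max: sorted[n - 1] };
}
```

For example, with 50 exact-match scores of which 46 are 1 and 4 are 0, this yields a mean of 0.92 with median, P95, and max of 1, matching the table above.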
## JSON Reporter
Writes the full result data to a JSON file. Includes per-case results with scores, aggregated metrics, assertions, and stats.
```ts
result.report('json', { path: './eval-report.json' });
```

Default path: `eval-report.json`
The JSON output includes:

```json
{
  "results": [
    {
      "case": { "input": "...", "expected": "..." },
      "output": "...",
      "duration": 1234,
      "scores": [
        { "name": "exactMatch", "score": 1 },
        { "name": "contains", "score": 1 }
      ]
    }
  ],
  "aggregated": {
    "exactMatch": { "name": "exactMatch", "mean": 0.92, "median": 1, "p95": 1, ... }
  },
  "assertions": [
    { "name": "threshold(exactMatch)", "passed": true, "message": "..." }
  ],
  "stats": { "total": 50, "duration": 12340, "cost": 0.52 }
}
```

## CSV Reporter
Writes per-case results to a CSV file with one row per eval case. Metric scores are included as additional columns.
```ts
result.report('csv', { path: './eval-report.csv' });
```

Default path: `eval-report.csv`
Output format:

```csv
input,expected,output,duration,exactMatch,contains
What is 2+2?,4,4,89,1,1
Capital of France?,Paris,paris,124,0,1
```

Fields containing commas, quotes, or newlines are properly escaped.
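The escaping follows the usual RFC 4180 convention: quote any field containing a comma, quote, or newline, and double embedded quotes. A hedged sketch of that rule (illustrative, not the reporter's actual code):

```typescript
// RFC 4180-style CSV field escaping.
function escapeCsvField(field: string): string {
  if (/[",\n\r]/.test(field)) {
    // Wrap in quotes and double any embedded quote characters.
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}
```

So a field like `say "hi"` would be written as `"say ""hi"""`, while plain values pass through unchanged.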
## CI Reporter
Minimal output designed for CI pipelines. Prints one line per assertion and exits with code 1 if any assertion fails.
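As a rough sketch of this behavior (the `AssertionResult` shape and `reportCi` helper below are hypothetical, not the library's API), a reporter can set `process.exitCode` so that Node.js terminates with a failure status after pending output has flushed:

```typescript
// Hypothetical assertion result shape, for illustration only.
interface AssertionResult {
  name: string;
  passed: boolean;
  message?: string;
}

// Prints one line per assertion; returns the number of failures.
function reportCi(assertions: AssertionResult[]): number {
  let failed = 0;
  for (const a of assertions) {
    if (a.passed) {
      console.log(`PASS ${a.name}`);
    } else {
      failed++;
      console.log(`FAIL ${a.name}: ${a.message ?? ''}`);
    }
  }
  // Setting process.exitCode (rather than calling process.exit) lets
  // stdout flush before the process ends with a non-zero status.
  if (failed > 0) process.exitCode = 1;
  return failed;
}
```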
```ts
result.report('ci');
```

Example output:
```
Eval: 50 cases | 12340ms | $0.52
PASS threshold(exactMatch)
FAIL threshold(contains): contains = 0.96, expected >= 0.98
Result: 1 passed, 1 failed
```

The process exits with code 1 when failures are detected, which causes CI jobs to fail automatically. Use this in your test scripts:
```json
{
  "scripts": {
    "eval": "tsx eval/run.ts",
    "eval:ci": "tsx eval/run-ci.ts"
  }
}
```

```ts
// eval/run-ci.ts
const result = await suite.run();
result.report('ci');
```

## Multiple Reporters
Pass an array of reporter types to output in multiple formats at once:

```ts
result.report(['console', 'json', 'csv'], { path: './reports/eval' });
```

The `path` option is shared across file-based reporters. JSON appends `.json` and CSV appends `.csv` to the base path automatically if needed.
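The appending behavior can be pictured with a small helper like this (hypothetical, shown only to illustrate the rule described above):

```typescript
// Append the extension only when the base path doesn't already end with it.
function withExtension(basePath: string, ext: string): string {
  return basePath.endsWith(ext) ? basePath : basePath + ext;
}
```

Under this rule, a shared base path of `./reports/eval` becomes `./reports/eval.json` for the JSON reporter and `./reports/eval.csv` for the CSV reporter.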
## CI Integration Example
```ts
import {
  Dataset,
  EvalSuite,
  exactMatch,
  contains,
  latency,
  threshold,
  noRegression,
} from '@cogitator-ai/evals';

const dataset = await Dataset.fromJsonl('./eval/dataset.jsonl');

const suite = new EvalSuite({
  dataset,
  target: { fn: myAgentFn },
  metrics: [exactMatch(), contains()],
  statisticalMetrics: [latency()],
  assertions: [
    threshold('exactMatch', 0.9),
    threshold('contains', 0.95),
    threshold('latency.p95', 5000),
    noRegression('./eval/baseline.json'),
  ],
});

const result = await suite.run();
result.report(['console', 'json', 'ci'], { path: './eval/report' });
```