Start with deterministic checks in CI even if you do not want to spend judge tokens on every push. Add judge keys only when you want natural-language gating.
GitHub Actions
.github/workflows/eval.yml
Exit codes: 0 = pipeline ran end-to-end; 1 = kensa itself errored (config, missing keys, scenario load failure). Failed scenarios do not change the exit code, so gate on the report output instead.
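Because a failed scenario still leaves the exit code at 0, a CI step has to read the report to fail the build. Below is a minimal sketch of such a gate. It assumes the `--format json` report is a single JSON object with a top-level `scenarios` list whose entries carry `name` and `passed` fields. That schema is hypothetical and not taken from kensa's documentation, so adjust the field names to match the real report. The helper names `failed_scenarios` and `gate` are also illustrative.

```python
import json


def failed_scenarios(report: dict) -> list[str]:
    """Return the names of scenarios that did not pass.

    ASSUMPTION: the report has a top-level "scenarios" list and each entry
    has "name" and "passed" keys. This schema is hypothetical.
    """
    return [
        scenario.get("name", "<unnamed>")
        for scenario in report.get("scenarios", [])
        if not scenario.get("passed", False)
    ]


def gate(report_json: str) -> int:
    """Return a process exit code: 1 if any scenario failed, otherwise 0."""
    failed = failed_scenarios(json.loads(report_json))
    for name in failed:
        print(f"FAILED: {name}")
    return 1 if failed else 0
```

In a workflow, the value returned by `gate` could be passed to `sys.exit` so that the step fails whenever a scenario fails, independently of kensa's own exit code.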
What needs API keys
Deterministic checks need no API keys. They run entirely locally. Judge criteria need an API key for the LLM provider. If any scenario sets criteria or judge and no API key is available, kensa eval exits 1. Either add a provider key as a secret, or remove judge criteria from the scenarios you run in CI.
This means you can run cost, latency, tool ordering, and output matching checks in CI for free, and add LLM judging only when keys are wired up.
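A pre-flight check can make the "no key available" case explicit before kensa runs, for example to print a clearer CI message. The sketch below only tests whether a candidate environment variable is set. The variable name `JUDGE_PROVIDER_API_KEY` is a placeholder, not a name kensa is known to read, so substitute whatever your judge provider expects.

```python
import os
from collections.abc import Mapping

# HYPOTHETICAL variable name: replace with the key your judge provider uses.
CANDIDATE_KEY_VARS = ("JUDGE_PROVIDER_API_KEY",)


def judge_key_available(
    env: Mapping[str, str] = os.environ,
    names: tuple[str, ...] = CANDIDATE_KEY_VARS,
) -> bool:
    """Return True if at least one candidate variable holds a non-empty value."""
    return any(env.get(name, "").strip() for name in names)
```

A workflow could call `judge_key_available()` first and run only the deterministic scenarios when it returns `False`.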
Output formats
| Format | Flag | Use case |
|---|---|---|
| Terminal | (default) | Local development |
| Markdown | --format markdown | PR comments, CI logs |
| JSON | --format json | Machine-readable, dashboards |
kensa eval also writes a standalone HTML report to .kensa/reports/{run_id}.html automatically on every run; upload it as a CI artifact for a shareable view.
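Since the report filename contains the run ID, a CI step that wants to upload one specific file first has to find it. A minimal sketch, assuming the most recently modified `.html` file in `.kensa/reports/` belongs to the run that just finished. The helper name `latest_report` is illustrative.

```python
from pathlib import Path
from typing import Optional


def latest_report(reports_dir: str = ".kensa/reports") -> Optional[Path]:
    """Return the most recently modified .html file in reports_dir, or None."""
    reports = sorted(
        Path(reports_dir).glob("*.html"),
        key=lambda path: path.stat().st_mtime,
    )
    return reports[-1] if reports else None
```

Uploading the whole directory as the artifact avoids this lookup entirely; the helper is only useful when a single file path is needed.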
PR comment integration
Pipe markdown output to a PR comment:
PR comment step
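One way to post the markdown is through GitHub's REST API, which creates PR comments via the issue-comments endpoint. The sketch below is not the original workflow step. It only builds the request, and it assumes the markdown has already been captured as a string. The function name and its parameters are illustrative.

```python
import json
import urllib.request


def build_pr_comment_request(
    repo: str, pr_number: int, markdown: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that posts `markdown` as a PR comment.

    `repo` is "owner/name". Pull requests share the issue-comments endpoint.
    """
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    return urllib.request.Request(
        url,
        data=json.dumps({"body": markdown}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` sends it. In GitHub Actions the token would typically come from the workflow's `GITHUB_TOKEN` secret, and the job needs write permission on pull requests.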