Quickstart
Get kensa running in under a minute.
Option 1: Skills + CLI (recommended)
npx skills add satyaborg/kensa # install eval skills
uv add kensa # or: pip install kensa
This is the recommended setup for Codex, Cursor, OpenCode, Gemini CLI, and other coding agents. It installs five skills (audit-evals, generate-scenarios, generate-judges, validate-judge, diagnose-errors) plus the CLI runtime. Then tell your coding agent: "evaluate my agent".
The skill automatically adds kensa as a project dependency on its first run and uses the CLI to drive the eval workflow. No server or extra config is needed.
Option 2: Claude Code plugin
If you primarily use Claude Code, install kensa as a plugin instead:
/plugin marketplace add satyaborg/kensa
/plugin install kensa
The plugin provides the same skills as the npx install and is updated through the marketplace.
Provider extras
Install the extra that matches your stack for auto-instrumentation:
uv add "kensa[anthropic]"
uv add "kensa[openai]"
uv add "kensa[langchain]"
uv add "kensa[all]"
See Tracing & Instrumentation for passive trace collection and OTel backend setup.
Try an example
git clone https://github.com/satyaborg/kensa.git && cd kensa
uv sync --extra openai # or --extra anthropic
cd examples/sql-analyst
Then, inside any coding agent (Claude Code, Codex, Cursor, OpenCode, Gemini CLI, …), say:
> evaluate this agent
No pre-written scenarios or setup are needed; kensa generates the scenarios from your code.
Add instrumentation if needed
The coding-agent workflow runs kensa doctor and helps add missing instrumentation. Manual setup mainly applies if you use kensa without the skills flow:
from kensa import instrument
instrument()
# Your existing imports below
from anthropic import Anthropic
# ...
instrument() must be called before your SDK imports. It configures OpenTelemetry, writes spans as JSONL, and auto-instruments any detected SDK. It is a no-op when KENSA_TRACE_DIR is unset, so it's safe to leave in production code.
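To check that spans are being written after a run, a small script along the following lines may help. This is a minimal sketch, not part of kensa: summarize_traces is a hypothetical helper, and it assumes the trace files sit directly inside the KENSA_TRACE_DIR directory, use a .jsonl extension, and hold one JSON object per line. It only counts records and makes no assumption about the fields inside each span.

```python
import json
import os
from pathlib import Path


def summarize_traces(trace_dir):
    """Count JSON records per .jsonl file directly under trace_dir.

    Hypothetical helper: the .jsonl extension and the flat directory
    layout are assumptions, not documented kensa behavior.
    """
    counts = {}
    for path in sorted(Path(trace_dir).glob("*.jsonl")):
        records = 0
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                json.loads(line)  # raises ValueError if the line is not valid JSON
                records += 1
        counts[path.name] = records
    return counts


if __name__ == "__main__":
    trace_dir = os.environ.get("KENSA_TRACE_DIR")
    if trace_dir is None:
        print("KENSA_TRACE_DIR is unset; nothing to inspect.")
    else:
        for name, count in summarize_traces(trace_dir).items():
            print(f"{name}: {count} record(s)")
```

If the script reports no files after an instrumented run, the first things to check are whether KENSA_TRACE_DIR was set in the process that ran the agent and whether instrument() was called before the SDK imports.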