Open-source execution control for AI workloads.
Most AI apps call the model too soon.
Every request becomes a prompt. Every prompt becomes tokens. Every token becomes latency, cost, and infrastructure pressure.
KORA turns AI requests into structured execution paths before inference: task graphs, deterministic-first execution, validation, telemetry, and model escalation only when needed.
Before:
request -> prompt -> model -> output
After:
request -> task graph -> deterministic path -> validation -> model escalation -> telemetry
Structure first. Inference second.
For local development in this repository:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e ".[dev]"
Run the CLI and first offline demo:
python3 -m kora --help
python3 -m kora examples list
python3 -m kora run direct_vs_kora -- --offline
Inspect the output to see how KORA turns a direct model-first path into a controlled execution path.
KORA sits between an AI request and a model call.
It helps developers:
- turn requests into explicit task graphs
- run deterministic work before inference
- validate outputs before escalation
- make model calls conditional instead of default
- record telemetry around each execution path
- compare direct model-first execution against controlled execution
KORA does not try to make models smarter. It controls when, why, and how they are used.
KORA reduced model invocations by 80% in a reproducible deterministic-heavy benchmark workload.
This result is based on the current deterministic-heavy alpha benchmark and should not be interpreted as a universal production cost-reduction claim.
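As a toy illustration (this is not the KORA benchmark and uses none of its code), a deterministic-heavy workload can produce a reduction of this shape when repeated requests are absorbed by a deterministic path and only distinct requests reach the model:

```python
# Toy illustration of model-call reduction on a repetitive workload.
# NOT the KORA benchmark; it only shows how deterministic routing of
# repeated requests can cut model invocations.

def direct(requests):
    # Model-first: every request becomes a model call.
    return len(requests)

def controlled(requests):
    # Deterministic-first: repeated requests are answered from a cache,
    # so the model is invoked once per distinct request.
    cache = {}
    model_calls = 0
    for r in requests:
        if r not in cache:
            model_calls += 1              # escalation: first time we see r
            cache[r] = f"answer({r})"
    return model_calls

# 100 requests drawn from 20 distinct prompts: heavy repetition.
workload = [f"prompt-{i % 20}" for i in range(100)]
print(direct(workload), controlled(workload))  # 100 vs 20: an 80% reduction
```

Real workloads repeat less cleanly than this, which is exactly why the benchmark scope caveat above matters.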
For methodology, counters, artifact policy, and reproduction commands, see:
- Runtime evidence reviewer guide
- Benchmark artifact policy
- Benchmark result summary
- Claim registry
- Validation roadmap
Next validation targets:
- Runtime-integrated benchmark paths with real model calls
- Customer-support triage workloads
- RAG answer-routing workloads
- Agent budget-guard workloads
We are looking for early developers and AI app teams who want to test KORA against real workloads.
KORA validation roadmap.
See the KORA validation roadmap for the measurement plan.
See the real model-call validation design for the next measurement path.
KORA includes a local no-network validation path that measures model-call routing without requiring API keys or external providers.
Customer-support triage local validation is available as a no-network example.
Local no-network validation examples can generate Markdown reports with --report-md.
Local no-network validation examples use --adapter local_validation by default; pass --adapter local_runtime explicitly to select the deterministic in-process local runtime stub.
Reviewer packet: local no-network validation.
The reviewer packet includes the no-network baseline checklist, adapter-selection commands, fail-closed safety checks, and local Markdown report generation examples.
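The fail-closed adapter selection described above can be sketched as follows. The adapter names mirror the documented flags, but the registry and function are hypothetical, invented for this sketch rather than taken from KORA:

```python
# Illustrative fail-closed adapter selection (not the KORA API).
# Only known no-network adapters are accepted; anything else is rejected
# rather than silently falling through to a network-capable provider.

NO_NETWORK_ADAPTERS = {"local_validation", "local_runtime"}

def select_adapter(name: str = "local_validation") -> str:
    # Fail closed: an unknown adapter raises instead of defaulting to a
    # path that might reach the network.
    if name not in NO_NETWORK_ADAPTERS:
        raise ValueError(f"unknown or network-requiring adapter: {name!r}")
    return name

print(select_adapter())                   # default: local_validation
print(select_adapter("local_runtime"))    # explicit deterministic stub
```

The design choice worth noting is the direction of the failure: an unrecognized adapter stops execution instead of degrading to a networked default.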
Local model adapter design: provider-neutral local runtime path.
Real provider adapter design-only packet: future provider boundary.
Real provider test harness design-only packet: dry-run provider validation contract.
Good candidate workloads:
- customer-support triage
- repetitive RAG workflows
- agent workflows with budget or escalation rules
- deterministic-heavy backend workflows
- LLM apps with high repeated request patterns
To participate, follow the contact and discussion routes.
How to help test KORA with a real workload.
See Help Test KORA for the workload submission template.
Start with the KORA Documentation Index for the developer path:
- Start
- Understand
- Run
- Inspect evidence
- Help test
- Contribute
Useful entry points:
- Examples directory
- Telemetry and observability counters
- Public language guide
- Contact and discussion routes
- Community manager guide
- Contributing guide
Target package install path:
pip install kora
Homebrew install path:
brew install kora
For the current repository alpha, use the editable local install in the 3-Minute Local Run.
Current examples available in this repository:
- examples/hello_kora
- examples/direct_vs_kora
- examples/retry_demo
- examples/real_workload_harness
- examples/stress_test
- examples/runtime_integrated_benchmark
Use --offline for reproducible first-run paths without OpenAI credentials.
Included in the alpha surface:
- execution-layer primitives for structured AI workloads
- task graph and scheduler foundations
- deterministic-first execution and verification components
- telemetry summarization and reporting
- repository examples covering direct-vs-structured execution, retries, stress behavior, and runtime evidence flow
- terminal-first developer workflow
Not included in the alpha surface:
- GUI-first product
- chatbot interface
- desktop AI app
- model hosting or model serving engine
- production cost-reduction proof
- real API-cost reduction proof
- energy reduction evidence
KORA is not:
- a chatbot
- a desktop AI app (not yet)
- a hosted chat-product alternative
- a model serving engine
- another agent wrapper that only forwards prompts to providers
KORA is a standalone open-source execution-control layer for AI workloads.
Want to contribute? Start with:
- Contributor pathway
- Contact and discussion routes
- CONTRIBUTING.md
- Good first issue candidates
- SECURITY.md
- GOVERNANCE.md
- CODE_OF_CONDUCT.md
KORA is part of the broader Krako infrastructure.
Related repository:
- Krako 2.0: TBD
Apache-2.0. See LICENSE.