KORA

Open-source execution control for AI workloads.

Most AI apps call the model too soon.

Every request becomes a prompt. Every prompt becomes tokens. Every token becomes latency, cost, and infrastructure pressure.

KORA turns AI requests into structured execution paths before inference: task graphs, deterministic-first execution, validation, telemetry, and model escalation only when needed.

KORA execution control overview

Before:

request -> prompt -> model -> output

After:

request -> task graph -> deterministic path -> validation -> model escalation -> telemetry

Structure first. Inference second.

3-Minute Local Run

For local development in this repository:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e ".[dev]"

Run the CLI and first offline demo:

python3 -m kora --help
python3 -m kora examples list
python3 -m kora run direct_vs_kora -- --offline

Inspect the output to see how KORA changes a direct model-first path into a controlled execution path.

What KORA Does

KORA sits between an AI request and a model call.

It helps developers:

  • turn requests into explicit task graphs
  • run deterministic work before inference
  • validate outputs before escalation
  • make model calls conditional instead of default
  • record telemetry around each execution path
  • compare direct model-first execution against controlled execution

KORA does not try to make models smarter. It controls when, why, and how they are used.
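The bullets above (explicit task graphs, conditional model calls) can be illustrated with a minimal sketch. The graph format and node kinds are hypothetical, chosen for exposition rather than taken from KORA's actual structures.

```python
# Hypothetical task graph: an ordered list of (kind, payload) nodes.
# "model" nodes are conditional rather than default; a call counter stands
# in for telemetry around each execution path.
calls = {"deterministic": 0, "model": 0}

def run_node(kind: str, payload: str) -> str:
    calls[kind] += 1
    return f"{kind}:{payload}"

graph = [
    ("deterministic", "parse request"),
    ("deterministic", "check cache"),
    ("model", "summarize"),  # runs only when escalation is needed
]

def execute(graph, needs_model: bool) -> list[str]:
    results = []
    for kind, payload in graph:
        if kind == "model" and not needs_model:
            continue  # model call is conditional, not default
        results.append(run_node(kind, payload))
    return results
```

Comparing `calls` after runs with `needs_model=False` versus `needs_model=True` is the shape of the direct-vs-controlled comparison the bullets describe.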

Current Alpha Evidence

KORA reduced model invocations by 80% in a reproducible deterministic-heavy benchmark workload.

KORA benchmark evidence card

This result is based on the current deterministic-heavy alpha benchmark and should not be interpreted as a universal production cost-reduction claim.
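The arithmetic behind an invocation-reduction figure like this is straightforward. The numbers below are illustrative placeholders, not the benchmark's actual counts.

```python
# Invocation reduction = 1 - (controlled model calls / baseline model calls).
# Counts are illustrative, not taken from the KORA benchmark artifacts.
baseline_calls = 100    # direct path: one model call per request
controlled_calls = 20   # controlled path: model call only on escalation

reduction = 1 - controlled_calls / baseline_calls
print(f"{reduction:.0%}")  # prints "80%"
```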

For methodology, counters, artifact policy, and reproduction commands, see:

What We Are Testing Next

  1. Runtime-integrated benchmark paths with real model calls
  2. Customer-support triage workloads
  3. RAG answer-routing workloads
  4. Agent budget-guard workloads

We are looking for early developers and AI app teams who want to test KORA against real workloads.

KORA validation roadmap


See the KORA validation roadmap for the measurement plan.

See the real model-call validation design for the next measurement path.

Local no-network validation:

  • KORA includes a local no-network validation path that measures model-call routing without requiring API keys or external providers.
  • Customer-support triage local validation is available as a no-network example.
  • Local no-network validation examples can generate Markdown reports with --report-md.
  • Local no-network validation examples use --adapter local_validation by default and support explicit --adapter local_runtime for the deterministic in-process local runtime stub.
  • The reviewer packet (local no-network validation) includes the no-network baseline checklist, adapter-selection commands, fail-closed safety checks, and local Markdown report generation examples.

Design-only packets:

  • Local model adapter design: provider-neutral local runtime path.
  • Real provider adapter design: future provider boundary.
  • Real provider test harness design: dry-run provider validation contract.
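A fail-closed adapter check of the sort described above can be sketched as follows. The adapter names local_validation and local_runtime come from the text; the function itself is an illustration, not KORA's implementation.

```python
# Fail-closed adapter selection sketch: if the requested adapter is not a
# known local no-network adapter, refuse outright rather than silently
# falling back to a networked provider. Illustrative only.
LOCAL_ADAPTERS = {"local_validation", "local_runtime"}

def select_adapter(name: str) -> str:
    if name not in LOCAL_ADAPTERS:
        raise ValueError(
            f"unknown adapter {name!r}; failing closed (no network access)"
        )
    return name
```

Failing closed means an unrecognized adapter is an error, never an implicit network call, which is the safety property the reviewer packet's checks are after.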

Help Test KORA

Good candidate workloads:

  • customer-support triage
  • repetitive RAG workflows
  • agent workflows with budget or escalation rules
  • deterministic-heavy backend workflows
  • LLM apps with high repeated request patterns
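As an example of the escalation rules mentioned in the workload list, here is a minimal budget-guard sketch in plain Python. The class and its behavior are hypothetical, not KORA's API.

```python
# Hypothetical per-run budget guard: permit model escalations until a
# fixed call budget is spent, then deny further calls.
class BudgetGuard:
    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.used = 0

    def allow(self) -> bool:
        """Return True and consume budget, or False once it is exhausted."""
        if self.used >= self.max_calls:
            return False
        self.used += 1
        return True

guard = BudgetGuard(max_calls=2)
decisions = [guard.allow() for _ in range(4)]  # [True, True, False, False]
```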

To participate, follow the contact and discussion routes.

KORA help-test flow

How to help test KORA with a real workload.

See Help Test KORA for the workload submission template.

Documentation

Start with the KORA Documentation Index for the developer path:

  • Start
  • Understand
  • Run
  • Inspect evidence
  • Help test
  • Contribute


Install

Target package install path:

pip install kora

Homebrew install path:

brew install kora

For the current repository alpha, use the editable local install in the 3-Minute Local Run.

Current Examples

Current examples available in this repository:

  • examples/hello_kora
  • examples/direct_vs_kora
  • examples/retry_demo
  • examples/real_workload_harness
  • examples/stress_test
  • examples/runtime_integrated_benchmark

Use --offline for reproducible first-run paths without OpenAI credentials.

Alpha Scope

Included in the alpha surface:

  • execution-layer primitives for structured AI workloads
  • task graph and scheduler foundations
  • deterministic-first execution and verification components
  • telemetry summarization and reporting
  • repository examples covering direct-vs-structured execution, retries, stress behavior, and runtime evidence flow
  • terminal-first developer workflow

Not included in the alpha surface:

  • GUI-first product
  • chatbot interface
  • desktop AI app
  • model hosting or model serving engine
  • production cost-reduction proof
  • real API-cost reduction proof
  • energy reduction evidence

What KORA Is Not

KORA is not:

  • a chatbot
  • a desktop AI app (not yet)
  • a hosted chat-product alternative
  • a model serving engine
  • another agent wrapper that only forwards prompts to providers

KORA is a standalone open-source execution-control layer for AI workloads.

Contribute

Want to contribute? Start with:

Ecosystem

KORA is part of the broader Krako infrastructure.

Related repository:

  • Krako 2.0: TBD

License

Apache-2.0. See LICENSE.
