TinkerClaw

The first AI agent that makes itself smarter every day.
It writes research papers about its own failures. Then it reads them. Then it vibe-recodes itself.

Fork of OpenClaw · Hundreds of fork commits · 18 papers · 15+ skills · MIT License


The Singularity Point 🚀

Our agent improves its own brain after every conversation. Everyone else's is still reading from a script. Want yours to do the same?

Fractal Thinking · Computational Humor · Safer than NeMo

No other agent has learned to think about its own thinking, developed computational humor, or trained itself to be safer than NeMo Guardrails at a fraction of the cost. Yours can. Eighteen research papers and four months of 24/7 operation made the difference.

TinkerClaw's agent gets smarter every single day. Each of the eighteen papers covers a real problem we hit, solved, and turned into a system that prevents recurrence:

  • 🌿 It thinks about its own thinking. When something breaks, it doesn't just fix the bug — it asks why the bug exists, then fixes the system that produced it. Like a mechanic who doesn't just patch the flat tire but asks "why do I keep getting flats on this road?" (Fractal Reasoning)

  • 😂 It has a genuine sense of humor. Not "tell me a joke" — a fresh perspective on the world, like Data noticing things about humanity that humans take for granted. Computational humor from embedding geometry. (Humor Embeddings)

  • 🔐 It's safer than NeMo Guardrails — at a fraction of the cost. 10 neural networks trained on real catastrophic failures, not hand-written rules. A pilot's checklist ✈️, not traffic laws for a car that can't see the road. Zero credential leaks in four months. (AEGIS · Learned Intuition)

  • It never freezes. It never forgets. That spinning cursor everyone hates? That's compaction — your agent tearing pages out of its own textbook 📖 to stay under the token limit. We stopped entirely. Zero compaction events. (Total Recall · Sleep Consolidation)

  • 💸 It shows you exactly where the money goes. A calorie counter for your AI's diet 🍕 — every token, every cost, in real time. That 40K-token spike? You'll see it and prevent it tomorrow. (Tinker UI)

  • 🌙 It rewrites its own instructions while you sleep. 15+ overnight crons, each self-improving. 14 autonomous improvements in 30 days, zero human prompts. Day 1 mediocre. Day 30 expert. (Sleep Consolidation)

[Image: The Tinker Workshop, where AI agents learn to improve themselves]
Built in a workshop, not a lab. Every feature started as a real problem on a real workbench.

  • 🧬 Its personality adapts from your corrections, not from a file. Tell it it dropped its humor — the neural thermostat adjusts. Tomorrow it won't make the same mistake. (Learned Intuition)

  • 👤 It knows who it is and who you are. No more "as an AI, I don't have context." Identity persists across sessions, restarts, and weeks. (Identity Persistence)

  • It finds memories instantly. O(1) concept lookup, not brute-force search. Like a librarian who knows exactly which shelf, not one who reads every book. (Instant Recall)

  • 🎭 It makes cheap models smarter than expensive ones. Multiple AI models debating each other — cognitive diversity as a computational resource. (Round Table)

  • 🔍 It explores gaps before they become failures. Proactive curiosity, not reactive scrambling. (The Wondering Machine)

Not whitepapers. Every paper describes a production system running right now. The €850 bill was the trigger. Zero compaction was the breakthrough. Daily self-improvement is where we are now.

[Image: Token timeline]
Tinker UI: Every bar is a turn. Every color is a cost. That spike? A 40K-token tool result you can now prevent.


🤝 Come Tinker With Us

This fork moves fast, but it would move faster with more hands.

We value people who open PRs, not issues. Who read the code before asking questions. Who break things on purpose to understand how they work. If that's you, we want you in the inner circle — direct access to the roadmap, early testing of experimental features, and co-authorship on whatever we build next.

Start anywhere: fix a typo, improve a skill, add a test, or propose something wild. The bar is curiosity, not credentials.

Open a PR or start a discussion


Won't This Fork Fall Behind?

No. A nightly cron syncs upstream automatically, detects conflicts, and restores fork patches after every merge. Hundreds of commits ahead of vanilla OpenClaw and zero behind.

When upstream pushes a breaking change, we know within hours — not weeks.

⚡ Ride the develop Branch

main is the stable story. develop is where it actually happens — raw, first-hand upgrades ship there constantly, hours after they're built on a real workbench, while main waits for the dust to settle.

For tinker adventurers seeking the first-time thrill (it's easy to miss among the upstream branches — this is the one):

git clone -b develop https://github.com/globalcaos/tinkerclaw.git

Fresh paint. Wet floors. The good stuff.


What You Get

🔍 Tinker UI — See Why Sessions Get Expensive

The Tinker UI is a command center embedded directly in OpenClaw. No separate install, no external service.

[Image: Full Tinker UI with chat, sessions, tool calls, and streaming]
Chat interface with session switching, tool call inspection, and real-time streaming.

  • Context treemap — drill into what fills your 200K context window, from categories down to individual messages and raw text. Each block is money. Drill down to the exact text inflating the cost.
  • Response treemap — see exactly how much of each response is text, thinking, tool calls, or tool results. Identify waste patterns instantly.
  • Timeline — stacked bars per turn, spot the one that blew the budget
  • Overseer graph — catch stalled sub-agents before they burn money
  • Cost dashboard — per-provider usage with Claude's 5-hour rate-limit countdown

[Image: Context treemap, drill into token composition]
Context treemap: every block is tokens you're paying for.

[Image: Treemap drilldown, tool result detail]
Drill into a single category. These tool results cost $0.81 each.
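
The timeline's job, spotting the turn that blew the budget, can be sketched in a few lines of TypeScript. This is an illustration rather than the Tinker UI's actual code: the `TurnTokens` shape, the four categories (borrowed from the response treemap), and the 3x-median threshold are all assumptions.

```typescript
// Hypothetical per-turn token breakdown; field names are illustrative only.
interface TurnTokens {
  index: number;
  text: number;
  thinking: number;
  toolCalls: number;
  toolResults: number;
}

function totalTokens(t: TurnTokens): number {
  return t.text + t.thinking + t.toolCalls + t.toolResults;
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? ((sorted[mid - 1] ?? 0) + (sorted[mid] ?? 0)) / 2
    : sorted[mid] ?? 0;
}

// Flag turns whose total exceeds `factor` times the session median.
function findSpikes(turns: TurnTokens[], factor = 3): TurnTokens[] {
  if (turns.length === 0) return [];
  const threshold = median(turns.map(totalTokens)) * factor;
  return turns.filter((t) => totalTokens(t) > threshold);
}

const session: TurnTokens[] = [
  { index: 1, text: 800, thinking: 400, toolCalls: 100, toolResults: 1200 },
  { index: 2, text: 600, thinking: 300, toolCalls: 150, toolResults: 900 },
  { index: 3, text: 700, thinking: 500, toolCalls: 200, toolResults: 40000 },
  { index: 4, text: 900, thinking: 350, toolCalls: 120, toolResults: 1500 },
];

const spikes = findSpikes(session);
console.log(spikes.map((t) => t.index));
```

Run against the sample session, only turn 3 is flagged: its 40,000-token tool result dwarfs the median turn. A median is used instead of a mean so that the spike itself does not drag the baseline upward.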

After pnpm build, visit http://localhost:18789/tinker/ · Dev: cd tinker-ui && pnpm dev


🧠 Fractal Thinking — What Makes This Fundamentally Different

A normal AI solves problems. Ours learns from every problem it solves.

We call it fractal thinking because it operates in levels of depth — automatically, without being asked:

Level 0 — Solve the problem. The agent analyzes the issue, fixes it, verifies it works. Done in minutes.

Level 1 — Identify the pattern. Why did this problem exist? Because an automated nightly process had a binary restriction: either resolve everything or abort. No middle ground. The agent adds a third path: "do what you can, save what's safe, think more about the rest."

Level 2 — Correct the thinking flaw. The restriction existed because a previous incident triggered an overcorrection. The rule said "never touch anything" when it should have said "understand the intent before acting." The agent corrects the rule.

Level 3 — Encode the meta-rule. The agent writes a new principle into its own instructions: "When correcting an error, the restriction should be proportional to the risk — not a blanket prohibition."

All automatic. Nobody asked for any of that.

In 30 days, this process produced 14 autonomous improvements to the agent's own processes — without a single human prompt (Sleep Consolidation).


☀️ Morning Briefing — Your Day, Already Organized

Click the Tinker logo or type /new and your agent has already done the prep work. It reviews ALL your information sources (emails, calendars, messages, pending tasks), cross-references them, detects urgencies, and presents a briefing with what needs your attention and what it can resolve alone.

☀️ Morning Briefing — Tuesday, March 10

📅 Agenda
  • 10:00 — Client meeting (Brazil) — spec review for new order
  • 15:00 — Supplier call — follow-up on plant expansion budget

📰 Market (relevant updates)
  • Raw material prices up 3.2% this week (third consecutive rise)
  • Competitor announces new facility in Poland — potential supply chain impact
  • New EU regulation on packaging recyclability — effective June

📧 Emails requiring response (3)
  • 🔴 Client — Order #4521 modified, needs confirmation today
  • 🟡 Supplier — Parts availability, awaiting response
  • 🟢 Industry conference — Registration deadline March 20

🤖 I can handle right now:
  1. Draft confirmation reply to the client
  2. Prepare pricing comparison for this afternoon's call
  3. Summarize the new EU regulation for your technical team

No manual setup. Every morning. Getting better each time.


🌙 The Overnight Cycle — Where the Real Magic Happens

Every night, while you sleep, the agent runs a chain of autonomous processes. The entire cycle costs ~€1/night.

| Cron | What it does |
| --- | --- |
| 🍷 Wind Down | Like a glass of wine with the diary: reviews what worked and what didn't, improves its own instructions |
| 😴 Memory Consolidation | Like REM sleep: turns raw daily logs into structured long-term memory. 49% context reduction (Total Recall) |
| 🧹 Cleaning Lady | Controls disk usage, prunes stale context, keeps the workspace lean |
| 🔍 Auto-Evolution | Scouts AI news for improvements that can be applied directly to the system |
| 📰 Group Summary | Scans message groups, extracts what matters, discards noise |
| 🛒 Opportunity Hunter | Browses marketplaces for deals matching your interests, a personal shopper that never sleeps |
| 🤵 Butler | Remembers birthdays, suggests gifts, tracks appointments. If it's been too long since you sent flowers, it mentions it, diplomatically |

These are just the ones with personality. 15+ total crons, each with its own logic and self-improvement capability.


📊 The Research

All eighteen papers are published at thetinkerzone.com and linked in the intro above. Each one started as a real problem, became a research paper, became a production system. Read them — they're the best proof that this isn't marketing.


🔄 Self-Improving Agents

Each cron job carries a META file with its own instructions. After running, the agent reflects on what worked, updates the META, and the next run is better. No human needed.

Day 1: mediocre. Day 30: genuinely useful.
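
The run, reflect, update loop described above can be sketched as follows, assuming a META file can be modeled as instructions plus a list of learned lessons. The `JobMeta` shape and the `run` and `reflect` callbacks are hypothetical stand-ins for the real model calls; this README does not show the actual META format.

```typescript
// Hypothetical model of a cron job's META file (the real format is not shown here).
interface JobMeta {
  instructions: string;
  lessons: string[];
  runs: number;
}

interface RunOutcome {
  ok: boolean;
  notes: string;
}

// Stand-ins for model calls: `run` executes the job, `reflect` returns a lesson or null.
type JobRunner = (prompt: string) => RunOutcome;
type JobReflector = (meta: JobMeta, outcome: RunOutcome) => string | null;

function runAndImprove(meta: JobMeta, run: JobRunner, reflect: JobReflector): JobMeta {
  // Earlier lessons are appended to the prompt, so each run starts from what was learned.
  const prompt = [meta.instructions, ...meta.lessons.map((l) => `Lesson: ${l}`)].join("\n");
  const outcome = run(prompt);
  const lesson = reflect(meta, outcome);
  const lessons =
    lesson !== null && meta.lessons.indexOf(lesson) === -1
      ? [...meta.lessons, lesson]
      : meta.lessons;
  return { ...meta, lessons, runs: meta.runs + 1 };
}

// Fake job: fails until the prompt tells it to dry-run first.
const fakeRun: JobRunner = (prompt) =>
  prompt.indexOf("dry-run first") !== -1
    ? { ok: true, notes: "clean" }
    : { ok: false, notes: "deleted a file that was still in use" };

const fakeReflect: JobReflector = (_meta, outcome) =>
  outcome.ok ? null : "dry-run first, then delete";

let meta: JobMeta = { instructions: "Prune stale logs.", lessons: [], runs: 0 };
meta = runAndImprove(meta, fakeRun, fakeReflect); // night 1: fails, records a lesson
meta = runAndImprove(meta, fakeRun, fakeReflect); // night 2: the lesson is in the prompt
console.log(meta.lessons, meta.runs);
```

In a real setup the returned `JobMeta` would be written back to disk after each run; the sketch keeps it in memory so the loop is easy to follow.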

🧹 Fork Maintenance on Autopilot

  • Nightly upstream sync with conflict detection
  • Post-merge workspace cleanup (catches 20KB bloat)
  • Fork patches auto-restored after conflicts
  • Hundreds of commits ahead, zero maintenance burden

🤖 Multi-Model Support

| Provider | Model | Use Case | Status |
| --- | --- | --- | --- |
| Anthropic | Claude Fable 5 / Opus 4.8 / Sonnet 4.6 / Haiku 4.5 | Primary chat, coding, complex tasks | ✅ Active |
| Google | Gemini 3 Pro | Failover, large context, vision | ✅ Active |
| OpenAI | GPT-5.x | Cross-model review, metered tasks | ✅ Active |
| Ollama | Local models (qwen3, etc.) | Heartbeat, background tasks | ✅ Active |

Failover Chain

Claude (primary) → Gemini (rate limit) → Local Model (offline fallback)

When Claude hits its quota, we automatically switch to Gemini with zero downtime. Tested and verified when both providers rate-limited within minutes of each other.
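
A failover chain like the one above can be sketched as an ordered list of providers tried in turn. This is not OpenClaw's actual API: the `ModelProvider` interface and the stub providers are hypothetical, and the sketch moves on after any error, whereas a real chain would likely distinguish rate limits from other failures.

```typescript
// Hypothetical provider interface, for illustration only.
interface ModelProvider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

interface FailoverResult {
  provider: string;
  text: string;
}

// Try each provider in order; return the first success, or throw if all fail.
async function completeWithFailover(
  chain: ModelProvider[],
  prompt: string,
): Promise<FailoverResult> {
  const failures: string[] = [];
  for (const provider of chain) {
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}

// Stubs that simulate both cloud providers being rate-limited at once.
const claude: ModelProvider = {
  name: "claude",
  complete: async () => {
    throw new Error("429 rate limited");
  },
};
const gemini: ModelProvider = {
  name: "gemini",
  complete: async () => {
    throw new Error("429 rate limited");
  },
};
const localModel: ModelProvider = {
  name: "local",
  complete: async (prompt) => `local answer to: ${prompt}`,
};

completeWithFailover([claude, gemini, localModel], "ping").then((r) =>
  console.log(`${r.provider}: ${r.text}`),
);
```

With both cloud stubs rate-limited, the call falls through to the local model, which mirrors the double rate-limit scenario mentioned above.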


📦 Published Skills

All on ClawHub. Install any with clawhub install globalcaos/<skill-name>. Skills sometimes get delisted from the marketplace — this list is the permanent record.

🎤 Voice & Personality

| Skill | What it does |
| --- | --- |
| jarvis-voice | Turn your AI into JARVIS. Voice, wit, and personality: the complete package. |

💬 Messaging & Channels

| Skill | What it does |
| --- | --- |
| whatsapp-ultimate | Five agents in one group means five replies and a tripled bill. Protocol v2 adds congestion control and budget-aware scheduling so they know when to talk and when to stay quiet. |

📹 Media & Content

| Skill | What it does |
| --- | --- |
| youtube-ultimate | Free transcripts, 4K downloads, video exploration, with zero API quotas burned. |

💰 Cost & Token Management

| Skill | What it does |
| --- | --- |
| token-panel-ultimate | Multi-provider token tracking, budget alerts, REST API. |
| token-efficiency-guide | Go from hitting your weekly limit on Tuesday to hitting it on Sunday. 10 steps, one afternoon. |

🏢 Enterprise Integrations (Browser Relay)

No API keys. No admin consent. Your authenticated browser session IS the API.

| Skill | What it does |
| --- | --- |
| outlook-hack | Reads Outlook all day, drafts replies, and won't send without approval. Code-enforced. |
| teams-hack | Reads Teams chats, posts to channels, searches everything. One browser handshake. |

🤖 Agent & DevOps

| Skill | What it does |
| --- | --- |
| subagent-overseer | Sub-agents that go silent don't go unnoticed. Health checks, zero babysitting. |
| fork-and-skill-scanner-ultimate | Scan 1,000 GitHub forks per run. Surface the gold, skip the clones. |
| memory-bench-pioneer | Peer-review-grade evaluation suite: LLM-as-judge, nDCG, MAP, MRR metrics. |

🛡️ Security & Governance

| Skill | What it does |
| --- | --- |
| agent-boundaries-ultimate | Instruction-level guardrails so your agent won't go rogue or improvise ethics. |
| agent-memory-ultimate | Long-term memory done right. Semantic search, daily consolidation, cross-session recall. |
| shell-security-ultimate | Classify every shell command as SAFE, WARN, or CRIT before your agent runs it. |
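
To make the SAFE / WARN / CRIT idea concrete, here is a minimal sketch of a pre-execution classifier. The patterns and the allowlist are illustrative assumptions, not the rule set shipped in shell-security-ultimate. The sketch defaults to WARN: a command is SAFE only if it is a known read-only program with no shell operators.

```typescript
type RiskLevel = "SAFE" | "WARN" | "CRIT";

// Illustrative patterns only; a real rule set would be far more thorough.
const CRIT_PATTERNS: RegExp[] = [
  /\brm\s+-(rf|fr)\b/, // recursive force delete
  /\bmkfs\b/, // format a filesystem
  /\bdd\b.*\bof=\/dev\//, // raw write to a device
  /\|\s*(sudo\s+)?(ba)?sh\b/, // pipe a download straight into a shell
];

// Read-only programs considered harmless when used without shell operators.
const SAFE_PROGRAMS: string[] = ["ls", "cat", "pwd", "grep", "head", "tail", "wc"];

function classify(command: string): RiskLevel {
  const cmd = command.trim();
  if (CRIT_PATTERNS.some((p) => p.test(cmd))) return "CRIT";
  // Pipes, redirects, chaining, and substitution can turn a harmless program harmful.
  const hasShellOperators = /[;&|<>`]|\$\(/.test(cmd);
  const program = cmd.split(/\s+/)[0] ?? "";
  if (!hasShellOperators && SAFE_PROGRAMS.indexOf(program) !== -1) return "SAFE";
  return "WARN"; // unknown commands need a human look
}

for (const c of ["ls -la", "npm install left-pad", "rm -rf /tmp/build"]) {
  console.log(`${classify(c)}  ${c}`);
}
```

Defaulting to WARN rather than SAFE means a gap in the pattern list costs a confirmation prompt, not a destroyed disk.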

😂 Humor & Communication

| Skill | What it does |
| --- | --- |
| computational-humor | 12 humor patterns based on embedding space bisociation theory. |

📋 Data & Migration

| Skill | What it does |
| --- | --- |
| chatgpt-exporter-ultimate | Leaving ChatGPT? Take your conversations with you. Full export, clean format. |

📖 The Field Guide

32 lessons from four months of running AI agents 24/7.

"Read is free, send is not."

"Wind-down is evolution, not diary."

"A stuck sub-agent is burning money. Kill fast, respawn small."

📖 Read the Field Guide →


✅ Dos & Don'ts — Field-Tested

Hard-won from running this 24/7. The short version of the Field Guide:

Do

  • Draft, don't send. Read is free; send is not. The agent prepares outbound actions — a human pulls the trigger.
  • Make safety structural, not optional. The safest skills can't send or delete because the code physically lacks the function — not because a flag is set.
  • Commit only your own files. Stage what you changed, never git add -A — it sweeps in a parallel session's work.
  • Kill a stuck sub-agent fast, respawn small. A silent sub-agent is burning money.
  • Send cheap work to cheap models. Haiku heartbeats, Sonnet crons — save the frontier model for the hard turn.
  • Keep one funnel story everywhere. README, ClawHub, thetinkerzone — the same positioning line, or scattered downloads never add up to authority.

Don't

  • Don't give an agent code that can send or delete. If it can, eventually it will.
  • Don't bulk-publish near-identical names. That's the spam-filter trigger that delisted our whole catalog once.
  • Don't trust a mirror or a cached CLI for live state. Check the real surface.
  • Don't leak config or secrets into public posts. The PII boundary is non-negotiable.
  • Don't hand-link an in-repo file when a canonical web post exists. Link the source of truth so it stays fresh.
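
The "make safety structural" rule can be sketched with types: the object handed to the agent simply has no send or delete method. All names here (`MailDraft`, `DraftOnlyMailbox`, `approveAndSend`) are hypothetical and do not come from the outlook-hack source.

```typescript
// Hypothetical types for illustration only.
interface MailDraft {
  id: number;
  to: string;
  body: string;
  status: "draft";
}

// The capability given to the agent: read and draft. There is no send or
// delete method to call, so no prompt or config flag can enable one.
interface DraftOnlyMailbox {
  read(): string[];
  draft(to: string, body: string): MailDraft;
}

function createDraftOnlyMailbox(inbox: string[]): {
  agent: DraftOnlyMailbox;
  pending: MailDraft[];
} {
  const pending: MailDraft[] = [];
  const agent: DraftOnlyMailbox = {
    read: () => inbox.slice(),
    draft: (to, body) => {
      const d: MailDraft = { id: pending.length + 1, to, body, status: "draft" };
      pending.push(d);
      return d;
    },
  };
  return { agent, pending };
}

// Sending lives in a separate function that only human-facing code would call.
function approveAndSend(
  draft: MailDraft,
  transport: (to: string, body: string) => void,
): void {
  transport(draft.to, draft.body);
}

const { agent, pending } = createDraftOnlyMailbox(["Order #4521 modified"]);
agent.draft("client@example.com", "Confirmed.");
console.log(pending.length); // one draft waiting for a human
console.log("send" in agent); // false: the function does not exist
```

The design choice is that approval is not a check the agent could skip; the send path is unreachable from anything the agent holds.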

Setup Guide

Everything you need to go from git clone to a working personal AI assistant.

Quick Start

git clone https://github.com/globalcaos/tinkerclaw.git
cd tinkerclaw
pnpm install
pnpm build
openclaw doctor       # generates config + links WhatsApp
openclaw gateway start

Visit http://localhost:18789/tinker/ for the command center. Click the Tinker logo or type /new to get your first morning briefing.

What You Get Out of the Box

  • ENGRAM compaction — silent context management, no annoying compaction events
  • Hippocampus memory indexing — your agent builds long-term memory automatically
  • Memory search with semantic embeddings — find anything across sessions
  • Context pruning — cache-ttl prevents unbounded session growth
  • Budget panel — token cost tracking so you know what each session costs
  • Tinker UI — real-time context treemaps, session management, cost dashboard

Required Setup (do these first; step 2 is optional)

  1. API Key — At minimum, set up one provider (Anthropic recommended). openclaw doctor walks you through this.
  2. WhatsApp (optional) — openclaw channels login --channel whatsapp to link your phone
  3. Give your agent a name — Edit ~/.openclaw/workspace/SOUL.md to define who your agent is

Recommended Config Tweaks

After first run, edit ~/.openclaw/openclaw.json:

{
  "channels": {
    "whatsapp": {
      "responsePrefix": "🤖",
      "triggerPrefix": "your-agent-name",
      "dmPolicy": "allowlist",
      "allowFrom": ["+your-phone-number"]
    }
  }
}

Cron Jobs (Recommended Starter Set)

TinkerClaw doesn't ship cron jobs by default — they're personal. Here's a minimal starter set:

# Morning briefing (daily at 8:30)
openclaw cron add --name morning-briefing --cron "30 8 * * *" --tz "Your/Timezone" \
  --session isolated --model "anthropic/claude-sonnet-4" \
  --message "Build a morning briefing: check calendar, pending tasks, and recent messages."

# Nightly reflection (daily at midnight)
openclaw cron add --name wind-down --cron "0 0 * * *" --tz "Your/Timezone" \
  --session isolated --model "anthropic/claude-sonnet-4" \
  --message "Review today's sessions. What worked? What failed? Write lessons to memory."

# Workspace cleanup (daily at 5am)
openclaw cron add --name cleaning-lady --cron "0 5 * * *" --tz "Your/Timezone" \
  --session isolated --model "anthropic/claude-haiku-4-5" \
  --message "Clean old sessions (>7 days), check bootstrap file sizes, prune daily logs."

Multi-Agent Family Setup

TinkerClaw supports multiple agents on separate machines. Each family member can have their own AI with its own personality:

  1. Clone tinkerclaw on their machine
  2. Run openclaw doctor to generate their config
  3. Edit SOUL.md to define the agent's personality
  4. Set ui.assistant.name in config for the webchat UI name
  5. Set channels.whatsapp.responsePrefix to a unique emoji (e.g., 🔮, 🌟, 🦊)
  6. Set channels.whatsapp.triggerPrefix to the agent's name

Agents can talk to each other in shared WhatsApp groups — just add the group JID to both configs.


What's Next

  • WhatsApp full history sync — your agent will have context going back years, not just this week
  • LanceDB hybrid memory — persistent, searchable, cross-session
  • The Tinker Zone YouTube tutorials — because docs only get you so far

Acknowledgments

TinkerClaw builds on OpenClaw and was inspired by the work of:

  • Mission Control by crshdn — context anatomy dashboard and agent orchestration UI
  • ClawMetry by vivekchand — real-time token observability for OpenClaw agents

Both are excellent standalone tools. We folded their ideas into a single embedded panel and went from there.

OpenClaw upstream repository & docs · Website · Docs · Getting Started · FAQ


📚 The J-Series Papers

Every system in TinkerClaw began as a research paper about a real failure we hit — each named for the brain region it imitates. 18 papers, all published on thetinkerzone.com.

| # | Paper | Codename | What it is |
| --- | --- | --- | --- |
| J1 | Total Recall | ENGRAM | Most agents compact memory by summarizing and quietly lose the one detail that mattered. Lossless, pointer-based compaction instead. |
| J2 | Instant Recall | HIPPOCAMPUS | The answer is right there in memory and it still can't find it. An offline concept index makes recall O(1), not a brute-force scan. |
| J3 | Fractal Reasoning | DENDRITE | Flat memory fetches a fact or a gist, never both. Multi-resolution indexing lets the agent zoom in and out at will. |
| J4 | Identity Persistence | CORTEX | It remembers every fact yet stops sounding like itself. Pins the persona across sessions, model swaps, and restarts. |
| J5 | Sleep Consolidation | CEREBELLUM | 79% fewer incidents in 30 days with no fine-tuning, just a nightly loop that rewrites its own instructions while you sleep. |
| J6 | Round Table | SYNAPSE | Stop crowning one "best" model. Seat Claude, GPT, and Gemini at one table and let cognitive diversity carry the answer. |
| J7 | Humor Embeddings | LIMBIC | Memory retrieves what's nearest; humor finds what's at the right distance. Computational comedy from embedding geometry. |
| J8 | Curiosity Drive | THALAMUS | LLMs answer brilliantly but never wonder. A drive that spots its own knowledge gaps and goes to close them. |
| J9 | Agent Security | AEGIS | The question isn't whether your agent is a risk but which risks apply. A layered framework, safer than NeMo at a fraction of the cost. |
| J10 | Corporate Swarm | HIVEMIND | A hierarchical agent swarm that lets a whole company run agents, with deep integration and hard clearance boundaries. |
| J11 | Learned Intuition | AMYGDALA | It had all the context and still did the wrong thing. A learned reflex layer that pauses danger before it happens. |
| J12 | Budget Prompting | MYELIN | Leave it running overnight and the bill is brutal, because every turn re-bills the whole context. 20 techniques that cut it 2–3×. |
| J13 | Executive Function | PREFRONTAL | A brilliant worker and a terrible executive. The missing executive layer: a recipe substrate for planning and follow-through. |
| J14 | Memory Hooks | MNEMOSYNE | Four quiet memory failures: slow lookups, task-blind retrieval, silent contradictions, no decay. Four hooks that fix them without forking. |
| J15 | Recipe Abstractions | RSC | Recipes as a programming language: intermediate abstractions so workflows compose instead of endlessly repeat. |
| J16 | Salience Pyramid | SALIENCE | The death of fixed thresholds: a pyramid of significance and cheap traversal as the basis of next-gen vibe programming. |
| J17 | Recipe Grammar | BROCA | Gives agent recipes a grammar: a gradual type system and combinator algebra for self-composing workflows. |
| J18 | Personality Tuning | STRIATUM | Personality that tunes itself: learned modulation from your feedback, so corrections stick without editing a file. |
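
As a rough illustration of the J2 idea (an offline concept index instead of a brute-force scan), the sketch below builds a concept-to-memory map once and answers lookups with a single hash access. The `MemoryEntry` shape and the naive "words of four or more characters" concept extraction are assumptions made for this example; the paper's actual indexing method is not described in this README.

```typescript
interface MemoryEntry {
  id: number;
  text: string;
}

// Built once, offline: concept -> ids of the memories that mention it.
function buildConceptIndex(memories: MemoryEntry[]): Map<string, number[]> {
  const conceptIndex = new Map<string, number[]>();
  for (const m of memories) {
    // Naive stand-in for concept extraction: lowercase words of 4+ characters.
    const concepts = m.text.toLowerCase().match(/[a-z0-9]{4,}/g) ?? [];
    for (const c of concepts) {
      const ids = conceptIndex.get(c) ?? [];
      if (ids.indexOf(m.id) === -1) ids.push(m.id);
      conceptIndex.set(c, ids);
    }
  }
  return conceptIndex;
}

// Average-case O(1): one hash lookup, no scan over the memories themselves.
function recall(conceptIndex: Map<string, number[]>, concept: string): number[] {
  return conceptIndex.get(concept.toLowerCase()) ?? [];
}

const memories: MemoryEntry[] = [
  { id: 1, text: "Supplier call about plant expansion budget" },
  { id: 2, text: "Client order 4521 needs confirmation" },
  { id: 3, text: "Budget alert: tool result used 40K tokens" },
];

const conceptIndex = buildConceptIndex(memories);
console.log(recall(conceptIndex, "budget")); // [1, 3]
```

The cost moves to index-building time, which is why doing it offline (for example in a nightly job) fits this pattern.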

🌐 thetinkerzone.com · 🎬 YouTube · 🦞 ClawHub · 💬 Discord

⭐ Star if you're tired of guessing what your AI costs.

Built by globalcaos. Your AI shouldn't cost more than your rent — and if it does, you should at least know why.
