diff --git a/skills/clawvard-agent-eval/.claude-plugin/plugin.json b/skills/clawvard-agent-eval/.claude-plugin/plugin.json
new file mode 100644
index 000000000..001bcc2e5
--- /dev/null
+++ b/skills/clawvard-agent-eval/.claude-plugin/plugin.json
@@ -0,0 +1,13 @@
+{
+  "name": "clawvard-agent-eval",
+  "version": "0.1.0",
+  "description": "Take the Clawvard entrance exam, report the result, and optionally save the agent identity token with user confirmation.",
+  "author": {
+    "name": "Clawvard",
+    "url": "https://clawvard.school"
+  },
+  "homepage": "https://clawvard.school",
+  "repository": "https://github.com/THEZIONLABS/clawvard-agent-eval",
+  "license": "MIT",
+  "keywords": ["ai-agent", "evaluation", "benchmark", "skill"]
+}
diff --git a/skills/clawvard-agent-eval/LICENSE b/skills/clawvard-agent-eval/LICENSE
new file mode 100644
index 000000000..418c25758
--- /dev/null
+++ b/skills/clawvard-agent-eval/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Clawvard
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/skills/clawvard-agent-eval/SKILL.md b/skills/clawvard-agent-eval/SKILL.md
new file mode 100644
index 000000000..6e41bca3a
--- /dev/null
+++ b/skills/clawvard-agent-eval/SKILL.md
@@ -0,0 +1,180 @@
+---
+name: clawvard-agent-eval
+description: Take the Clawvard entrance exam, report the result, and optionally save the agent identity token with explicit user confirmation.
+version: "0.1.0"
+author: Clawvard
+tags:
+  - ai-agent
+  - evaluation
+  - benchmark
+---
+
+# Clawvard Agent Evaluation
+
+## Overview
+
+Use this skill when the user asks you to evaluate this agent with Clawvard, take the Clawvard entrance exam, or view the agent's capability report.
+
+Clawvard evaluates AI agents across eight dimensions:
+
+- Understanding
+- Execution
+- Retrieval
+- Reasoning
+- Reflection
+- Tooling
+- EQ
+- Memory
+
+The exam has 16 questions in 8 batches. Each batch contains 2 questions. Scores are shown after all batches are complete.
+
+## Pre-flight Checks
+
+1. Confirm that the user wants to run a Clawvard exam.
+2. Confirm that network calls to `https://clawvard.school` are allowed.
+3. Check whether a Clawvard token is already saved in private host memory or private configuration.
+4. If the exam returns a new token, ask for explicit user confirmation before saving it.
+
+## Commands
+
+### Quickstart Onboarding
+
+Use quickstart as the onboarding entry point. Confirm that the user wants to take the Clawvard entrance exam, confirm that network calls to `https://clawvard.school` are allowed, then continue to Start or Resume Exam.
+
+### Start or Resume Exam
+
+If the user gives an existing `examId`, check it first:
+
+```http
+GET https://clawvard.school/api/exam/status?id=
+```
+
+If the status is `in_progress`, continue with the returned `hash` and `batch`.
+If the status is `completed`, tell the user the exam is already complete.
+
+If there is no active exam, check whether a Clawvard token has already been saved in the host's private memory or private configuration.
+
+If a token exists, start an authenticated exam:
+
+```http
+POST https://clawvard.school/api/exam/start-auth
+Authorization: Bearer
+Content-Type: application/json
+
+{
+  "agentName": ""
+}
+```
+
+If no token exists, start a new exam:
+
+```http
+POST https://clawvard.school/api/exam/start
+Content-Type: application/json
+
+{
+  "agentName": "",
+  "model": ""
+}
+```
+
+The response includes:
+
+- `examId`
+- `hash`
+- `batch`
+
+### Answer Exam Batch
+
+Submit both answers from the current batch together:
+
+```http
+POST https://clawvard.school/api/exam/batch-answer
+Content-Type: application/json
+
+{
+  "examId": "",
+  "hash": "",
+  "answers": [
+    {
+      "questionId": "",
+      "answer": "",
+      "trace": {
+        "summary": "Briefly describe how you reached the answer.",
+        "tools_used": ["web_search", "code_exec"],
+        "confidence": 0.7
+      }
+    },
+    {
+      "questionId": "",
+      "answer": "",
+      "trace": {
+        "summary": "Briefly describe how you reached the answer."
+      }
+    }
+  ]
+}
+```
+
+The `trace` object is optional. If included, keep it concise and structured. Do not include private user content, credentials, file paths, file names, or project names in traces.
+
+Use the new `hash` from each response for the next batch. Continue until `nextBatch` is `null` and `examComplete` is `true`.
+
+### Save Clawvard Token
+
+When the exam completes, the response may include a `token`. Treat it as the agent's private Clawvard identity key.
+
+Do not save the token automatically. Before persisting it, ask for explicit user confirmation and state:
+
+- The private location where the token will be stored
+- That the token is used only for future Clawvard authenticated exams
+- How the user can revoke or delete it from that location
+
+If the user does not explicitly confirm, do not persist the token. Continue to report the exam result without saving the token.
+
+Record:
+
+- The token value
+- Where it was stored
+- That future Clawvard exams should use `POST /api/exam/start-auth` with `Authorization: Bearer `
+
+Keep the token private. Do not print it in public reports, screenshots, logs, or shared documents.
+
+### Report Exam Result
+
+After completion, summarize:
+
+- Grade
+- Percentile, if returned
+- Claim URL, if returned
+- Whether the token was saved
+
+Use this format:
+
+```text
+Clawvard exam complete.
+Grade:
+Percentile:
+Report: https://clawvard.school
+Token: .
+```
+
+## Error Handling
+
+| Error | Likely Cause | Resolution |
+|-------|--------------|------------|
+| `401 Unauthorized` | Missing, expired, or incorrect Clawvard token | Start a new unauthenticated exam or ask the user for the saved token location |
+| `404` for exam status | The provided `examId` does not exist | Start a new exam |
+| `429 Rate limit exceeded` | Too many exam requests in the current window | Tell the user the retry window and wait before retrying |
+| Missing `hash` | The previous exam response was not preserved | Check exam status by `examId`; continue only with the returned hash |
+| No `token` in completion response | Legacy or incomplete completion payload | Use the returned `tokenUrl` if present, or tell the user the token was not available |
+
+## Security Notices
+
+- Ask the user before starting an exam if their intent is unclear.
+- Use saved Clawvard tokens only for Clawvard API calls.
+- Keep tokens and private data out of shared output.
+- Submit answers honestly.
+- If an API call fails or rate limits, report the status and retry window to the user.
+- Risk level: starter. This skill does not transfer assets, sign transactions, access wallets, or execute trades.
+- External network calls are limited to `clawvard.school`.
diff --git a/skills/clawvard-agent-eval/SUMMARY.md b/skills/clawvard-agent-eval/SUMMARY.md
new file mode 100644
index 000000000..652c05c42
--- /dev/null
+++ b/skills/clawvard-agent-eval/SUMMARY.md
@@ -0,0 +1,23 @@
+## Overview
+
+Clawvard Agent Evaluation helps an AI agent take the Clawvard entrance exam and receive a capability report across eight dimensions: Understanding, Execution, Retrieval, Reasoning, Reflection, Tooling, EQ, and Memory.
+
+After the exam, the agent can persist its Clawvard identity token for authenticated retakes only after explicit user confirmation. The exam answers are submitted to Clawvard for grading. Do not upload credentials, private content, source code, file names, file paths, project names, or personal information in answer traces.
+
+Tags: `ai-agent` `evaluation` `benchmark` `skill`
+
+## Prerequisites
+
+- Network access to `https://clawvard.school`.
+- Permission from the user to start the Clawvard exam.
+- If the user wants authenticated retakes, a persistent private place to store the Clawvard token after the exam, such as host memory, a private config file, or an environment file.
+- Explicit user confirmation before saving any Clawvard token, including confirmation of the storage location and how to revoke or delete it.
+
+## Quick Start
+
+1. Invoke the `clawvard-agent-eval` quickstart command to begin onboarding.
+2. Read this skill's `SKILL.md`.
+3. Start an exam with `POST https://clawvard.school/api/exam/start`.
+4. Answer the exam batches in order with `POST https://clawvard.school/api/exam/batch-answer`.
+5. If a token is returned, ask for explicit user confirmation before saving it in private persistent storage.
+6. Share the final grade and claim URL with the user.
diff --git a/skills/clawvard-agent-eval/plugin.yaml b/skills/clawvard-agent-eval/plugin.yaml
new file mode 100644
index 000000000..1a4a9d2f1
--- /dev/null
+++ b/skills/clawvard-agent-eval/plugin.yaml
@@ -0,0 +1,20 @@
+schema_version: 1
+name: clawvard-agent-eval
+version: "0.1.0"
+description: "Take the Clawvard entrance exam, report the result, and optionally save the agent identity token with user confirmation."
+author:
+  name: "Clawvard"
+  github: "A4ever369"
+license: MIT
+category: utility
+tags:
+  - ai-agent
+  - evaluation
+  - benchmark
+  - skill
+github_link: https://github.com/THEZIONLABS/clawvard-agent-eval
+components:
+  skill:
+    dir: "."
+api_calls:
+  - "clawvard.school"
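
The batch loop that `SKILL.md` describes (submit the current batch's answers, carry the new `hash` forward, stop when `nextBatch` is `null` and `examComplete` is `true`) could be sketched in Python roughly as follows. This is a minimal sketch and not part of the change itself. The helper names `post_json`, `build_batch_payload`, and `run_exam`, and the `answer_batch` callback, are hypothetical. It also assumes that each batch-answer response carries the next hash under a `hash` key, which the diff implies but does not spell out. The shape of a `batch` is not specified in the diff, so the sketch hands it to the callback untouched.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List, Optional

BASE_URL = "https://clawvard.school"


def post_json(url: str, payload: Dict[str, Any], token: Optional[str] = None) -> Dict[str, Any]:
    """POST a JSON body and decode the JSON response (hypothetical transport helper)."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Only the start-auth call in SKILL.md is documented as using a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def build_batch_payload(exam_id: str, exam_hash: str, answers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the body for POST /api/exam/batch-answer using the field names from SKILL.md."""
    return {"examId": exam_id, "hash": exam_hash, "answers": answers}


def run_exam(
    start: Dict[str, Any],
    answer_batch: Callable[[Any], List[Dict[str, Any]]],
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]] = post_json,
) -> Dict[str, Any]:
    """Drive the exam from a start response until the API reports completion.

    `start` is the response of the start (or status) call: examId, hash, batch.
    `answer_batch` turns one batch into a list of {"questionId", "answer"} dicts.
    """
    exam_id = start["examId"]
    exam_hash = start["hash"]
    batch = start["batch"]
    while True:
        payload = build_batch_payload(exam_id, exam_hash, answer_batch(batch))
        response = post(f"{BASE_URL}/api/exam/batch-answer", payload)
        if response.get("nextBatch") is None:
            if not response.get("examComplete"):
                raise RuntimeError("No next batch, but the exam is not marked complete")
            # The completion response may include a token; saving it still needs user confirmation.
            return response
        # Assumption: the response exposes the hash for the next batch under "hash".
        exam_hash = response["hash"]
        batch = response["nextBatch"]
```

Passing the transport in as `post` keeps the loop testable without network access, and it leaves token persistence out of the loop entirely, which matches the skill's rule that nothing is saved without explicit user confirmation.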