# skill-evaluation

Here are 11 public repositories matching this topic...

oh-my-knowledge

Evaluation framework for LLM knowledge inputs — prompts, RAG corpora, skills, agent workflows. Fix the model, vary the artifact. Built-in statistical rigor: bootstrap CIs, Krippendorff's α, length debiasing, saturation curves.

  • Updated May 7, 2026
  • TypeScript
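The statistical toolkit named above is standard, so as a minimal sketch, here is how a percentile-bootstrap confidence interval over per-item scores can be computed. The function name, parameters, and usage below are illustrative assumptions, not oh-my-knowledge's actual API.

```typescript
// Minimal sketch: percentile-bootstrap CI over per-item scores.
// `bootstrapCI` and its signature are illustrative, not the repo's API.
function bootstrapCI(
  scores: number[],
  iterations = 10_000,
  alpha = 0.05
): { lo: number; hi: number } {
  const means: number[] = [];
  for (let i = 0; i < iterations; i++) {
    let sum = 0;
    for (let j = 0; j < scores.length; j++) {
      // Resample with replacement from the observed scores.
      sum += scores[Math.floor(Math.random() * scores.length)];
    }
    means.push(sum / scores.length);
  }
  means.sort((a, b) => a - b);
  return {
    lo: means[Math.floor((alpha / 2) * iterations)],
    hi: means[Math.floor((1 - alpha / 2) * iterations)],
  };
}

// Usage: 95% CI on a run of 0/1 pass scores for one prompt variant.
const { lo, hi } = bootstrapCI([1, 0, 1, 1, 0, 1, 1, 1, 0, 1]);
console.log(`95% CI: [${lo.toFixed(2)}, ${hi.toFixed(2)}]`);
```

Resampling at the item level keeps the interval honest about run-to-run variance without assuming the scores are normally distributed.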

Binary-criteria evaluation harness for Claude skills, with planned extensions to plugins, agents, and MCP servers. Scores every change yes/no across 7 layers: package integrity, trigger quality, functional quality, regression protection, baseline value, model variance, and rollout safety. No graded scales, ever.

  • Updated May 8, 2026
  • TypeScript
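The layer names above suggest a simple aggregation rule; the sketch below shows one plausible reading, in which a change passes only if every binary criterion passes. The types and the example verdict are assumptions, not the harness's real schema.

```typescript
// Sketch of binary-criteria scoring: every layer is a strict yes/no,
// and a change passes only if all layers pass. Layer names come from
// the description above; the types and example verdict are illustrative.
type Layer =
  | "packageIntegrity"
  | "triggerQuality"
  | "functionalQuality"
  | "regressionProtection"
  | "baselineValue"
  | "modelVariance"
  | "rolloutSafety";

type Verdict = Record<Layer, boolean>;

function passes(verdict: Verdict): boolean {
  // No partial credit: a single "no" fails the change.
  return Object.values(verdict).every(Boolean);
}

const verdict: Verdict = {
  packageIntegrity: true,
  triggerQuality: true,
  functionalQuality: true,
  regressionProtection: false, // one failed layer fails the change
  baselineValue: true,
  modelVariance: true,
  rolloutSafety: true,
};
console.log(passes(verdict)); // false
```

The all-or-nothing rule is what "never gradients" buys: any reviewer can reproduce a verdict, since there is no weighting or scale to argue over.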

Detect malicious code and security risks in AI skill files before installation to protect AI agents from hidden threats and obfuscation techniques.

  • Updated May 10, 2026
  • Python
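This repo is Python, but for consistency with the sketches above, here is a TypeScript illustration of the general idea: scan a skill file for obfuscation markers before installation. The patterns, names, and threshold are illustrative guesses, not the project's actual rule set.

```typescript
// Illustrative scanner: flag common obfuscation markers in a skill
// file before installation. Patterns are examples, not the repo's rules.
const suspiciousPatterns: { name: string; re: RegExp }[] = [
  { name: "base64 blob", re: /[A-Za-z0-9+\/]{120,}={0,2}/ },
  { name: "dynamic eval", re: /\beval\s*\(|\bexec\s*\(/ },
  { name: "hex escapes", re: /(\\x[0-9a-fA-F]{2}){8,}/ },
  { name: "remote fetch", re: /https?:\/\/[^\s"']+\.(sh|py|exe)\b/ },
];

function scanSkillFile(text: string): string[] {
  return suspiciousPatterns
    .filter(({ re }) => re.test(text))
    .map(({ name }) => name);
}

const findings = scanSkillFile('run("eval(atob(\'aW1wb3J0IG9z\'))")');
console.log(findings); // ["dynamic eval"]
```

A real scanner would parse rather than pattern-match, but static markers like these are a cheap first pass before any skill file reaches an agent.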
