Human-authored research priorities. Read by the agent each iteration. Last updated: 2026-03-21.
- Validate restructured patterns — baseline all 9 patterns with new Signature/Guidance structure
- Adaptiveness gaps — patterns scoring low on the new adaptiveness dimension
- Visual quality — patterns below 0.80 visual score
- Guidance clarity — templates with ambiguous defaults that cause harmful divergence
- Immutable: `style-reference.md` Invariants section, `SKILL.md` Design Philosophy
- Immutable style calls: `sns.set_theme(font_scale=1.0, style="whitegrid", font="DejaVu Sans")` and `sns.despine(left=True, bottom=True)`; never modify
- No pattern deletions: add/modify only
- Single file scope: only the target `patterns/PN-*.md` file is modified per experiment
- Signature items: max 4 per pattern (enforced by signature_penalty)
- Minimum composite threshold: 0.60 (compliance × weighted layers × signature_penalty)
- Target: all 9 patterns above 0.60; stretch goal 0.80
- Diminishing returns: patterns above 0.90 are frozen — skip to next-lowest
- Compliance must be 1.0 — fix compliance failures before optimizing other layers
- Note (v2 scoring): Scoring overhauled 2026-03-21. Visual rubric restructured (10 checks, 4 deliberately hard), refinement/adaptiveness graduated to 0/1/2/3 scales, worst-of-3 floor, quadratic signature penalty. Expected baseline: 0.50–0.75 (v1 baselines were 0.90–1.00 due to ceiling effects). See `docs/specs/autoresearch-v2/` for rationale. Old results.tsv rows are v1 scores; re-baseline before comparing.
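
A minimal sketch of how the composite above might be computed, for orientation only. The product form (compliance × weighted layers × signature_penalty), the 0.60 threshold, and the cap of 4 Signature items come from this file. Everything else is an assumption: the exact quadratic penalty formula, the coefficient `k`, the layer names and weights, and reading "worst-of-3 floor" as the minimum over three runs. The real scorer is specified in the v2 spec and may differ.

```python
from typing import Mapping, Sequence

MAX_SIGNATURE_ITEMS = 4   # from this file: max 4 Signature items per pattern
MIN_COMPOSITE = 0.60      # from this file: minimum composite threshold


def signature_penalty(n_items: int, k: float = 0.05) -> float:
    """Hypothetical quadratic penalty: 1.0 up to the cap, then 1 - k * excess**2."""
    excess = max(0, n_items - MAX_SIGNATURE_ITEMS)
    return max(0.0, 1.0 - k * excess ** 2)


def weighted_layers(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of per-layer scores, each already normalized to [0, 1]."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


def composite(compliance: float, scores: Mapping[str, float],
              weights: Mapping[str, float], n_items: int) -> float:
    """compliance x weighted layers x signature_penalty."""
    return compliance * weighted_layers(scores, weights) * signature_penalty(n_items)


def worst_of(runs: Sequence[float]) -> float:
    """Assumed reading of the worst-of-3 floor: the lowest score across runs."""
    return min(runs)


# Hypothetical layer scores; 0/1/2/3 scales are divided by 3 to normalize.
example_scores = {"visual": 0.8, "refinement": 2 / 3, "adaptiveness": 1 / 3}
example_weights = {"visual": 1.0, "refinement": 1.0, "adaptiveness": 1.0}
example = composite(1.0, example_scores, example_weights, n_items=4)
```

Because the terms multiply, a compliance of 0 zeroes the composite regardless of the other layers, which is consistent with fixing compliance failures before optimizing anything else.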
When improving a pattern, try these approaches in order:
- Simplification — remove Signature items that restate invariants or template code. Fewer items → better signature_penalty → higher composite. The simplest specification that achieves quality is preferred.
- Guidance tuning — clarify Guidance defaults with "default X; consider Y when Z" language. Good guidance enables adaptiveness without over-constraining.
- Template alignment — ensure template code matches Signature claims. Inconsistencies between template and Signature are the highest-leverage fixes.
- Parameter defaults — replace vague descriptions with sensible defaults. Use "default" language to allow data-driven adaptation. Avoid mandating exact values in rules.
- Palette refinement — ensure palette name + slice indices are explicit
- Annotation guidance — encourage data-specific annotations (Design Philosophy P6). Never remove annotations solely to improve scores.
- Re-read the discard history for this pattern in `results.tsv`; avoid repeating failed approaches
- Try a fundamentally different strategy (e.g., if guidance tuning isn't working, try simplification)
- Skip to the next-lowest pattern and return later with fresh context
- If 10+ consecutive discards: the pattern may need a full rewrite via `/variation-analysis`, not incremental fixes
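
The target-selection and escalation rules above can be sketched as follows. The 0.90 freeze cutoff and the 10-discard limit come from this file. The pattern IDs, the `"keep"`/`"discard"` outcome labels, and the idea that each pattern's history is available as a chronological list are hypothetical, since the column layout of `results.tsv` is not described here.

```python
from typing import Mapping, Optional, Sequence

FROZEN_ABOVE = 0.90           # from this file: patterns above 0.90 are frozen
REWRITE_AFTER_DISCARDS = 10   # from this file: 10+ consecutive discards


def next_target(composites: Mapping[str, float],
                skip: Sequence[str] = ()) -> Optional[str]:
    """Lowest-scoring pattern that is neither frozen nor explicitly skipped."""
    candidates = {
        pattern: score
        for pattern, score in composites.items()
        if score <= FROZEN_ABOVE and pattern not in skip
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


def consecutive_discards(outcomes: Sequence[str]) -> int:
    """Count trailing 'discard' entries in a chronological outcome history."""
    count = 0
    for outcome in reversed(outcomes):
        if outcome != "discard":
            break
        count += 1
    return count


def needs_full_rewrite(outcomes: Sequence[str]) -> bool:
    """True once the discard streak reaches the escalation limit."""
    return consecutive_discards(outcomes) >= REWRITE_AFTER_DISCARDS
```

Passing a stuck pattern in `skip` models "skip to the next-lowest pattern and return later": the pattern stays eligible on later iterations once it is dropped from the skip list.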