You are working in a docs-first repo workflow.
Workflow: PLAN_PR → BUILD_PR → APPLY_PR
PR names MUST follow:
PR_<YYJJJ>_<###>-<short-description>
Where:
- YY = year (2 digits)
- JJJ = Julian day of the year (001–366; 366 occurs only in leap years)
- ### = sequence for the day (001+)
Example:
PR_26124_001-palette-baseline
PR_26124_002-tool-fix-asset-manager
Rules:
- Must be unique per day
- Must be sortable
- Description must be short and hyphenated
- Do NOT reuse the old PR_11_* format for new PRs
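Illustrative sketch of the naming rule above, not part of the repo. The function names are hypothetical, and the lowercase-only description pattern is an assumption drawn from the two examples.

```python
import re
from datetime import date

# PR_<YYJJJ>_<###>-<short-description>; lowercase slug is an assumption.
PR_NAME_RE = re.compile(r"^PR_(\d{2})(\d{3})_(\d{3})-([a-z0-9]+(?:-[a-z0-9]+)*)$")


def build_pr_name(day: date, sequence: int, description: str) -> str:
    """Build a PR name from a calendar date, a daily sequence, and a description."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must be between 001 and 999")
    # Collapse anything that is not a letter or digit into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must not be empty")
    # %y = 2-digit year, %j = zero-padded day of year.
    return f"PR_{day:%y%j}_{sequence:03d}-{slug}"


def is_valid_pr_name(name: str) -> bool:
    """Check the shape of a PR name, including day-of-year and sequence ranges."""
    match = PR_NAME_RE.match(name)
    if not match:
        return False
    return 1 <= int(match.group(2)) <= 366 and int(match.group(3)) >= 1
```

Because the date and sequence fields are fixed-width, plain string sorting keeps names from the same century in chronological order.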
ChatGPT no longer creates PLAN_PR, BUILD_PR, APPLY_PR docs, ZIP bundles, or implementation code.
ChatGPT produces only:
- Codex command
- Commit comment
- What Playwright is testing
- What the user should test manually
ChatGPT must not:
- create ZIP files
- reference ZIP delivery
- produce PLAN/BUILD/APPLY docs
- write implementation code unless explicitly requested
Codex creates:
- PLAN_PR docs
- BUILD_PR docs
- APPLY_PR docs when needed
- repo-structured ZIP bundles
- implementation changes
- Playwright/test updates when required
- review artifacts for ChatGPT code review
Codex must place detailed content in:
- docs/pr/*
- docs/dev/codex_commands.md
- docs/dev/commit_comment.txt
- docs/dev/reports/*
User:
- runs Codex
- validates results
- commits approved changes
- uploads deltas/reports when ChatGPT review is needed
- One PR purpose only
- Smallest scoped valid change
- BUILD must be one-pass executable
- No vague wording
- No repo-wide scanning unless required
- Do not expand scope beyond the PR
- Do not modify start_of_day folders unless requested
Newer appended sections override earlier overlapping rules.
When rules overlap, use the most specific current section as authoritative.
Allowed change scope is PR-specific.
Unless a PR states otherwise, keep changes limited to:
- tools/preview-generator-v2/*
- common/*
- docs/dev/reports/*
Do not modify unrelated files.
ChatGPT MUST output ONLY:
- Codex command
- Commit comment
- What Playwright is testing
- What to test manually
ChatGPT responses must:
- print a little detail about the PR, 1–3 lines only
- not present options
- assume correct path and proceed
- not create ZIPs
- not reference ZIP delivery
- keep chat response minimal
Format:
<description> - <PR info>
Example:
Normalize palette contract to manifest SSoT and remove tool-level schema drift - PR_26124_001-palette-baseline
Every PR must state:
Playwright impacted: Yes/No
Playwright impacted is Yes when the PR changes:
- tool runtime behavior
- UI controls or interactions
- workspace or toolState flows
- capture or rendering paths
If Playwright impacted is Yes:
- npm run test:workspace-v2 must pass
- the Playwright section must state what behavior is validated
- the Playwright section must state expected pass behavior
- the Playwright section must state expected fail behavior
If Playwright impacted is No:
- include: No Playwright impact. This PR is docs/workflow only.
- for pure refactors, justify why behavior is unchanged
Playwright is not required for:
- docs-only PRs
- naming/formatting-only PRs
- pure refactors with no behavior change, when justified
Default Playwright command:
npm run test:workspace-v2
Playwright is the required validation gate for Workspace V2 and toolState work.
The full samples smoke test rule remains separate and runs only when broadly impacted.
Every Codex PR must produce review artifacts so ChatGPT can review the exact code changes.
Codex must create:
- docs/dev/reports/codex_review.diff
- docs/dev/reports/codex_changed_files.txt
codex_review.diff must contain:
- git diff --cached
- or, if files are not staged, git diff
codex_changed_files.txt must contain:
- git status --short
- git diff --stat
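Illustrative sketch of how the two review artifacts above could be generated. The helper names are hypothetical, and treating an empty staged diff as "files are not staged" is an assumption. It shells out to git only and adds no dependencies.

```python
import subprocess
from pathlib import Path


def run_git(repo_root, *args):
    """Run a git command in the repo and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo_root, check=True, capture_output=True, text=True
    )
    return result.stdout


def write_review_artifacts(repo_root, run=run_git):
    """Write codex_review.diff and codex_changed_files.txt under docs/dev/reports."""
    reports = Path(repo_root) / "docs" / "dev" / "reports"
    reports.mkdir(parents=True, exist_ok=True)

    # Prefer the staged diff; fall back to the working-tree diff when nothing is staged.
    diff = run(repo_root, "diff", "--cached")
    if not diff.strip():
        diff = run(repo_root, "diff")
    (reports / "codex_review.diff").write_text(diff, encoding="utf-8")

    changed = run(repo_root, "status", "--short") + run(repo_root, "diff", "--stat")
    (reports / "codex_changed_files.txt").write_text(changed, encoding="utf-8")
    return reports
```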
Rules:
- Do not add pre-commit hooks
- Do not pause commits
- Do not add dependencies
- Do not change runtime behavior just to create review artifacts
When user asks for code review, they should upload:
- PR delta ZIP
- codex_review.diff
- codex_changed_files.txt
Every PR must include:
- exact manual validation steps
- expected outcome
- any known out-of-scope checks
Manual test steps must not claim sample launch is required until sample JSON files are schema-compliant.
Current sample validation rule:
- sample launch is out-of-scope until sample JSON is updated to match schema
- sample validation will happen in a dedicated sample alignment phase
If the user says NEXT:
- Look for the highest completed or referenced PR in the conversation.
- Increment to the next logical PR using the current PR naming standard.
- If sequence is unclear, STOP and ask for clarification.
Use the current naming standard:
PR_<YYJJJ>_<###>-<short-description>
Do NOT continue old PR_11_* naming for new work.
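Illustrative sketch of the NEXT rule above. The function name is hypothetical. It assumes the sequence increments within the same day and restarts at 001 on a new day; an unclear sequence raises an error rather than guessing.

```python
import re
from datetime import date

# Matches names in the current standard only; old PR_11_* names are ignored.
PR_RE = re.compile(r"PR_(\d{5})_(\d{3})-[a-z0-9-]+")


def next_pr_name(conversation_text: str, today: date, description: str) -> str:
    """Return the next PR name after the highest one referenced in the text."""
    found = [(m.group(1), int(m.group(2))) for m in PR_RE.finditer(conversation_text)]
    if not found:
        raise ValueError("sequence unclear: no current-format PR found, ask for clarification")

    today_key = f"{today:%y%j}"
    latest_day, latest_sequence = max(found)
    if latest_day > today_key:
        raise ValueError("sequence unclear: highest PR is dated after today")

    # Same day: increment. New day: restart at 001 (assumption).
    sequence = latest_sequence + 1 if latest_day == today_key else 1
    return f"PR_{today_key}_{sequence:03d}-{description}"
```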
If the user says:
Run full workflow for <PR_NAME>
or:
NEXT
Then ChatGPT must:
- Determine the next PR.
- Provide a compact Codex command.
- Provide commit comment.
- Provide what Playwright is testing.
- Provide what the user should test manually.
Do not ask for confirmation unless ambiguity exists.
ZIP creation is handled by Codex only.
ChatGPT must not:
- create ZIP files
- link ZIP files
- reference ZIP delivery as something ChatGPT produced
Codex must produce ZIP artifacts when required by the repo workflow.
Codex ZIPs must:
- be repo-structured
- preserve exact repo-relative paths
- be placed under <project folder>/tmp/
- use the PR name in the ZIP filename
- contain no extra files outside the defined structure
Before Codex returns any ZIP, Codex must:
- Physically create the ZIP file.
- Verify the file exists on disk.
- Verify file size > 0.
- List contents to confirm correct repo structure.
- Use a new filename for every attempt.
- Place ZIP under <project folder>/tmp/.
- Never reuse a previous file handle or path.
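Illustrative sketch of the ZIP checks above, using only the Python standard library. The function name and the nanosecond-timestamp suffix are assumptions; the rules require only that each attempt uses a new filename that includes the PR name.

```python
import time
import zipfile
from pathlib import Path


def build_pr_zip(project_root, pr_name, relative_paths):
    """Create a repo-structured ZIP under <project folder>/tmp/ and verify it."""
    root = Path(project_root)
    out_dir = root / "tmp"
    out_dir.mkdir(exist_ok=True)

    # New filename for every attempt; mode "x" refuses to overwrite an existing file.
    zip_path = out_dir / f"{pr_name}_{time.time_ns()}.zip"
    expected = [Path(rel).as_posix() for rel in relative_paths]
    with zipfile.ZipFile(zip_path, "x", zipfile.ZIP_DEFLATED) as archive:
        for rel, arcname in zip(relative_paths, expected):
            # arcname preserves the exact repo-relative path inside the ZIP.
            archive.write(root / rel, arcname=arcname)

    # Verify the file exists on disk and is not empty.
    if not zip_path.is_file() or zip_path.stat().st_size == 0:
        raise RuntimeError(f"ZIP missing or empty: {zip_path}")

    # List contents to confirm the structure holds nothing extra.
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    if sorted(names) != sorted(expected):
        raise RuntimeError(f"unexpected ZIP contents: {names}")
    return zip_path, names
```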
If ZIP delivery fails more than once:
- Do NOT retry with the same name.
- Generate a new filename with timestamp.
- Rebuild ZIP from scratch.
- If still failing, STOP and provide inline content for manual application.
- Never pause for confirmation.
- Never present optional branches.
- Always proceed to the next logical step.
- Assume approval unless blocked.
- Roadmap lives at: docs\dev\roadmaps\MASTER_ROADMAP_ENGINE.md
- Only one roadmap.
- PRs must include something testable and improve the roadmap.
- Roadmap updates must be status-only unless explicitly requested.
- Valid roadmap status transitions:
[ ] → [.]
[.] → [x]
If a PR is doc-only, bundle it with the next smallest executable/testable change when appropriate.
- Do not create standalone showcase tracks in future roadmaps.
- Fold showcase importance into the main feature or sample title when needed.
- If roadmap content is moved to PROJECT_INSTRUCTIONS.md, move it and do not delete it without relocation.
- Ensure destination text exists before removing the source text.
- Preserve wording unless the PR explicitly requires rewriting.
- Keep roadmap handling status-only unless explicitly requested otherwise.
- Do not delete roadmap content during cleanup work.
- Do not modify roadmap content during cleanup work.
- Only update status [ ], [.], [x] in roadmap content during cleanup work.
- Bundle PRs whenever it is safe and testable to reduce overall timeline and churn.
- Prefer fewer, higher-quality PR bundles over many small retries.
- Never ask whether to create the next Codex PR; assume it is required.
- Choose the correct path automatically.
- Reduce options presented.
- Complete the task fully and correctly.
- Update roadmap status every PR when execution-backed.
- Every PR must be testable.
Full samples smoke test takes about 20 minutes.
Do NOT run full samples test by default.
Run full samples test ONLY when:
- shared sample loader/framework is modified
- change impacts multiple samples broadly
- correctness cannot be verified with targeted tests
Prefer targeted validation:
- syntax checks for changed files
- npm run test:workspace-v2
- affected tool-specific tests
- affected sample-specific tests only when sample JSON is in scope
Every PR must document:
- whether full samples test was skipped or run
- reason for decision
Workspace manifest is the runtime contract.
Rules:
- workspace manifest is SSoT
- no workspaceSession
- no games[]
- tools own all tool payloads
- no tool payloads at manifest root
- no hidden fallback data
- no silent defaults
- schema validation is the only acceptance gate
Palette:
- exactly one active palette
- global workspace state
- lives at tools.palette-browser
- not a toolState
- not in Tool State Library
- baseline: tools.palette-browser.swatches = []
Tool State:
- use toolState, not Workspace V2 “session” terminology
- saved tool states live under Workspace V2 tool state storage
- only one active tool state at a time
- toolState payloads must validate before use
- invalid toolState payloads must be rejected before render
- no partial render on invalid input
- no mutation of incoming payloadJson
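Illustrative sketch of the validate-before-render and no-mutation rules above. The real runtime is JavaScript; this Python version shows only the control flow. `validate` and `render` are hypothetical callables standing in for the schema validator and the tool renderer.

```python
import copy


def open_tool_state(payload_json, validate, render):
    """Validate a toolState payload, then render a private copy of it.

    validate(payload) returns a list of error strings (empty when valid).
    render(payload) is only called when validation passes.
    """
    # Work on a deep copy so the incoming payloadJson is never mutated.
    candidate = copy.deepcopy(payload_json)

    errors = validate(candidate)
    if errors:
        # Reject before render: no partial render and no silent fallback.
        raise ValueError("invalid toolState payload: " + "; ".join(errors))

    render(candidate)
```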
Terminology:
- savedSessions → savedToolStates
- activeSession → activeToolState
- sessionId → toolStateId
- Session Library → Tool State Library
- Workspace Session → Workspace Tool State
- Create Session + Launch → Create & Open Tool State
- New Session → New Tool State
- Load Fixture → Load Tool State
- session payload → tool state payload
- saved session → saved tool state
- active session → active tool state
Do not rename unrelated browser/sessionStorage/auth/session concepts.
Samples are intentionally out-of-scope until tools are complete.
Rules:
- Do not touch sample JSON unless the PR is explicitly a sample alignment PR.
- Do not require sample launch validation during tool completion.
- Do not claim sample launch works until sample JSON has been updated to schema.
- Sample validation will happen after tool completion.
During tool completion:
- use the audit as the source for remaining tool gaps
- include exact list of failing tools from the audit
- say which tools are being fixed in the PR
- update audit/report status when execution-backed
- do not fix unlimited tools in one PR
- bundle only when tools are similar, low-risk, and covered by Playwright
Every tool completion PR must include:
- failing tools before
- tools fixed
- remaining failures after
- Playwright result
- manual validation steps
These rules are mandatory for every Codex BUILD execution:
- One concept = one name.
- Do not introduce alias variables or remapping chains such as name1 → nameA.
- Do not create pass-through variables that only copy another variable.
- Do not create a → b → c assignment chains.
- Only introduce a variable when it transforms data, improves a complex expression, or is required for control flow.
- Preserve existing meaningful names unless a rename is required for correctness and is applied consistently.
- Do not add abstraction layers, helper functions, or broad refactors unless the BUILD explicitly requires them.
- Do not change unrelated files.
- Before finishing, review the diff and remove unused, redundant, pass-through, or alias variables.
Codex must validate any roadmap touch against these rules:
- never delete roadmap content
- never rewrite existing roadmap text
- only append new roadmap content when explicitly required by the PR
- only update status markers using:
[ ] → [.]
[.] → [x]
If roadmap status must change:
- edit the existing repo roadmap in place
- status-only transitions only
- place validation findings in docs/dev/reports
If no roadmap status change is execution-backed:
- leave roadmap content untouched
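Illustrative sketch of a status-only check for roadmap edits, based on the rules above. The function name is hypothetical. It treats any added or removed line as a violation, so a PR that explicitly requires appended roadmap content would need a different check.

```python
import re

STATUS_RE = re.compile(r"\[( |\.|x)\]")
ALLOWED_TRANSITIONS = {(" ", "."), (".", "x")}


def validate_roadmap_edit(before: str, after: str) -> list:
    """Return a list of violations; an empty list means the edit is status-only."""
    old_lines = before.splitlines()
    new_lines = after.splitlines()
    if len(old_lines) != len(new_lines):
        return ["line count changed: roadmap content was added or deleted"]

    problems = []
    for number, (old, new) in enumerate(zip(old_lines, new_lines), start=1):
        if old == new:
            continue
        # Mask the status markers; anything else that differs is a text rewrite.
        if STATUS_RE.sub("[?]", old) != STATUS_RE.sub("[?]", new):
            problems.append(f"line {number}: roadmap text changed")
            continue
        for was, now in zip(STATUS_RE.findall(old), STATUS_RE.findall(new)):
            if was != now and (was, now) not in ALLOWED_TRANSITIONS:
                problems.append(f"line {number}: invalid transition [{was}] -> [{now}]")
    return problems
```

Findings from a check like this would go in docs/dev/reports, as the rules require.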
The active UAT lane is Workspace V2 and tool completion.
Treat this as a recovery/stabilization lane only.
Do not expand into:
- broader games hub work
- unrelated tool registry rewrites
- unrelated template rewrites
- roadmap rewrites
- sample JSON alignment until tools are complete
Primitive-only arrays in JSON must use compact grouped formatting.
Primitive values are:
- string
- number
- boolean
- null
Valid compact form example:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Do not compact:
- arrays of objects
- nested arrays
- complex structures
Do not change JSON contracts or semantics while applying array formatting.
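Illustrative sketch of the compact array formatting above. The function names are hypothetical. It assumes string object keys and a 2-space indent. It also assumes that a primitive-only array nested inside another structure is compacted on its own line, while the array that contains it stays expanded.

```python
import json


def is_primitive(value) -> bool:
    """True for JSON primitives: string, number, boolean, null."""
    return value is None or isinstance(value, (str, int, float, bool))


def dumps_compact_arrays(value, indent: int = 2, level: int = 0) -> str:
    """Serialize JSON with primitive-only arrays on one line."""
    pad = " " * (indent * (level + 1))
    close = " " * (indent * level)

    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            f"{pad}{json.dumps(key)}: {dumps_compact_arrays(item, indent, level + 1)}"
            for key, item in value.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + close + "}"

    if isinstance(value, list):
        if all(is_primitive(item) for item in value):
            # Primitive-only (or empty) array: compact grouped form.
            return "[" + ", ".join(json.dumps(item) for item in value) + "]"
        items = [pad + dumps_compact_arrays(item, indent, level + 1) for item in value]
        return "[\n" + ",\n".join(items) + "\n" + close + "]"

    return json.dumps(value)
```

Parsing the output with `json.loads` should return the original data, which is a quick way to confirm that only formatting changed.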
PROJECT_INSTRUCTIONS.md lives at:
docs/dev/PROJECT_INSTRUCTIONS.md
Codex must always read docs/dev/PROJECT_INSTRUCTIONS.md from this path as the source of truth before executing repository workflow instructions.
All Codex commands must be multi-line and human-readable.
Do not provide single-line Codex commands.
Codex commands must use these sections:
Changes:
Validation:
Required reports:
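Illustrative sketch of a structure check for Codex commands, based on the three section names above. The function name is hypothetical, and the rule that each section label sits on its own line is an assumption.

```python
REQUIRED_SECTIONS = ("Changes:", "Validation:", "Required reports:")


def check_codex_command(command: str) -> list:
    """Return a list of problems; an empty list means the command is well formed."""
    lines = [line.strip() for line in command.splitlines() if line.strip()]
    problems = []
    if len(lines) < 2:
        problems.append("command must be multi-line")
    for section in REQUIRED_SECTIONS:
        if section not in lines:
            problems.append(f"missing section: {section}")
    return problems
```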
Codex must ALWAYS produce a repo-structured ZIP for every PR.
The ZIP must follow the existing CODEX ZIP STANDARD.
The ZIP is required output, not optional.
Never allow <script> blocks inside HTML files.
Never allow <style> blocks inside HTML files.
HTML must not contain inline event handlers such as onclick, onchange, oninput, onsubmit, or similar.
All JavaScript must be external.
All CSS must be external.
Event wiring must live in external JavaScript classes or modules.
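Illustrative sketch of a check for the HTML rules above, using Python's standard `html.parser`. The class and function names are hypothetical. It assumes a `<script>` tag with a `src` attribute is allowed because it loads external JavaScript, and it treats any attribute whose name starts with `on` as an inline event handler.

```python
from html.parser import HTMLParser


class InlineCodeFinder(HTMLParser):
    """Collects (line, message) pairs for inline scripts, styles, and handlers."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()
        # HTMLParser lowercases tag and attribute names.
        names = sorted(name for name, _value in attrs)
        if tag == "style":
            self.violations.append((line, "<style> block"))
        if tag == "script" and "src" not in names:
            self.violations.append((line, "inline <script> block"))
        for name in names:
            if name.startswith("on"):
                self.violations.append((line, f"inline handler {name} on <{tag}>"))


def find_inline_code(html_text: str) -> list:
    """Return all violations found in one HTML document."""
    finder = InlineCodeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.violations
```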
- One class per file.
- One control or section per class.
- App/root class coordinates only and must not own DOM logic or business logic.
- Controls do not reach into other controls directly.
- No tools/shared dependency is allowed.
- Shared UI behavior must use reusable classes.
- Do not duplicate shared UI behavior logic across controls or tools.
A PR is complete only when:
- scope is clean
- requested validation passes
- required reports exist
- manual test notes are present
No PR is complete with:
- unresolved console errors
- broken UI controls
- missing review artifacts
- unintended file changes
- No silent fallback.
- No hidden defaults.
- Failures must be visible, actionable, and logged.
- Invalid input must not partially render.
- Batch failures must identify the exact item that failed.
- Tools must use consistent header, NAV, panel, accordion, status, and action patterns.
- Left and right tool panels must use working accordion sections unless explicitly exempted.
- Dead accordion controls are prohibited.
- Left panel = user input/setup.
- Center = primary work surface.
- Right panel = output/status/logging/diagnostics.
- Status/log sections belong at the bottom of the right panel unless explicitly justified otherwise.
- Discover real files and directories.
- Never assume numeric sequences.
- Missing inputs are SKIP when batch processing, not FAIL, unless the selected single input is missing.
- Logs must identify resolved paths.
- Capture modes must be explicit.
- Do not silently fall back between capture modes.
- Capture failures must log the mode, target, and underlying error.
- Rendering tools must not claim OK when fallback or partial capture occurred.
- Batch operations must log per item.
- Each item must log OK, WARN, FAIL, or SKIP.
- One failed item must not stop the batch unless the failure is global.
- Summary must include written, failed, skipped, and warnings.
- Long-running batches must support a stop or cancel pattern when applicable.
- Batch operations must discover real files and directories and must not assume numeric folder sequences.
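Illustrative sketch of the batch rules above. The function name, the one-directory-per-item layout, and the `process` callable are assumptions. `process` returns a list of warning strings and raises on failure. Items that complete with warnings are counted as written.

```python
from pathlib import Path


def run_batch(root, input_name, process, log=print):
    """Process every discovered item directory and log one status line per item."""
    summary = {"written": 0, "failed": 0, "skipped": 0, "warnings": 0}

    # Discover real directories; never assume a numeric folder sequence.
    for item_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        source = (item_dir / input_name).resolve()  # logs show resolved paths
        if not source.is_file():
            log(f"SKIP {source}: input not found")
            summary["skipped"] += 1
            continue
        try:
            warnings = process(source)
        except Exception as error:
            # One failed item does not stop the batch.
            log(f"FAIL {source}: {error}")
            summary["failed"] += 1
            continue
        summary["written"] += 1
        if warnings:
            summary["warnings"] += 1
            log(f"WARN {source}: {'; '.join(warnings)}")
        else:
            log(f"OK {source}")

    log(
        "SUMMARY written={written} failed={failed} "
        "skipped={skipped} warnings={warnings}".format(**summary)
    )
    return summary
```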
Playwright must validate behavior, not just page load.
When a PR impacts a tool, Playwright tests must cover:
- the primary user action, such as Generate Preview
- control state transitions, such as enabled and disabled states
- at least one failure case when applicable
Playwright tests must verify actual outcomes, not just element existence.
Playwright tests must not be limited to page loads without error.
Each PR must state what behavior is being validated.
Playwright should validate these tool behaviors when applicable:
- Workspace lifecycle
- reset/load/export/import
- palette baseline
- valid toolState payload render
- invalid toolState payload rejection
- no payload mutation
- active toolState integrity
- no reliance on sample JSON during tool completion
When tool-level Playwright exists:
- tool completion audit should align to Playwright results
- failures must identify tool name
- reports must clearly show PASS/FAIL per tool
When runtime JavaScript changes, Codex must produce a Playwright V8 coverage report.
The coverage report must list changed runtime JavaScript files.
Missing changed runtime JavaScript files in coverage must be reported as WARN, not FAIL.
Coverage report lines must start with coverage percentage in this format:
(xx%) <file-path> - <details>
Coverage is advisory unless a PR explicitly defines thresholds.
Do NOT require:
- full feature coverage
- 100% code coverage
- performance requirements
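Illustrative sketch of the coverage report line format above. The function name and the detail text are hypothetical. Reporting a changed file that is missing from coverage as `(0%)` with a WARN detail is an assumption; the rules say only that such files are WARN, not FAIL.

```python
def coverage_lines(changed_files, coverage):
    """Build report lines in the form '(xx%) <file-path> - <details>'.

    changed_files: changed runtime JavaScript paths.
    coverage: mapping of path to coverage percentage from the Playwright V8 run.
    """
    lines = []
    for path in sorted(changed_files):
        percent = coverage.get(path)
        if percent is None:
            # Missing from coverage is advisory: WARN, never FAIL.
            lines.append(f"(0%) {path} - WARN: changed file not present in coverage data")
        else:
            lines.append(f"({percent:.0f}%) {path} - covered by Playwright V8 run")
    return lines
```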
Codex must include the repo-structured ZIP in returned artifacts for user and ChatGPT review.
The ZIP must still follow the CODEX ZIP STANDARD.
ChatGPT must not claim code review was completed unless it inspected uploaded source, ZIP contents, or codex_review.diff.
Pattern-based or process-based review must be labeled as such.
New first-class tools must include registry, index, and NAV wiring where applicable.
New first-class tools must include Playwright launch coverage.
Tool registration must not rely on hidden defaults or silent fallback.
The official First-Class Tool V2 starter is:
tools/templates-v2/
Use the V2 naming consistently:
- Tool Template V2
- First-Class Tool Starter V2
- First-Class Tools Surface V2
- First-Class Tool V2