28 changes: 28 additions & 0 deletions .agents/skills/protocol-migration/SKILL.md
@@ -13,6 +13,7 @@ Do not change protocol meaning.
Use `legacy/source.md` as the primary source when rewriting `README.md`.
Use `legacy/source.txt` only as a fallback when `legacy/source.md` looks malformed, incomplete, or unclear.
Use the PDF file in `legacy/` as the final reference source of truth for tables, figures, layout-dependent content, and anything still ambiguous.
Treat image references in `legacy/source.md` and files in `legacy/images/` as extracted protocol content, not decorative artifacts, until they have been reviewed.
If `legacy/source.md` and `legacy/source.txt` disagree, prefer `legacy/source.md` for general structure and prose, but use the original PDF as the final tie-breaker.

## Migration behavior
@@ -32,6 +33,23 @@ When converting legacy protocol content into the repository template:
- If any text does not fit cleanly into the template, place it under `# Migration notes` or `## Unplaced content`.
- Mark uncertainty with `CHECK:` instead of guessing.

## Images, figures, and image-based tables
PDF-to-Markdown conversion may extract protocol-relevant content as images, especially tables, thermocycler programs, reagent layouts, flow diagrams, gel/example images, or figure panels.

When migrating:

- Scan `legacy/source.md` for image references such as `![](images/...)`, and inspect `legacy/images/` for extracted images.
- Keep images that contain protocol content needed to perform or interpret the protocol.
- Omit only clearly decorative images such as logos, icons, page chrome, ornamental separators, or duplicated images that add no protocol content.
- Prefer converting image-based tables into Markdown tables when all headers, rows, values, units, grouping, and notes are legible and unambiguous.
- Preserve row grouping and cycle counts from thermocycler/program tables. If Markdown cannot represent the original grouping cleanly, add a short note or use repeated values rather than losing meaning.
- Do not OCR or transcribe illegible values by guesswork. If any value, header, grouping, or placement is uncertain, keep the image and add `CHECK:`.
- If an image contains non-tabular protocol content that cannot be safely converted to text, include the image in `README.md`.
- Place converted tables or retained images at the same logical location as the source image, near the step or section they support.
- In `README.md`, image paths must be valid from the repository root. Change extracted paths from `images/<file>` to `legacy/images/<file>` unless the image has deliberately been moved.
- Use descriptive alt text, for example `![Thermocycler program](legacy/images/page-3-image-2.png)`, not empty alt text.
- Mention in `# Migration notes` which extracted images were converted to Markdown tables, which were retained as images, and which were omitted as non-protocol/decorative.
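
The path rule above (`images/<file>` becomes `legacy/images/<file>`, with non-empty alt text) could be checked mechanically. The following is a minimal sketch, not part of this change: the function name `rewrite_image_paths` and the regular expression are assumptions made for illustration, and the sketch only handles plain `![alt](path)` references.

```python
import re

# Hypothetical pattern: Markdown image references whose path starts with "images/".
# Paths already rewritten to "legacy/images/..." do not match, so reruns are safe.
IMAGE_REF = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<path>images/[^)\s]+)\)")


def rewrite_image_paths(markdown: str) -> tuple[str, list[str]]:
    """Prefix extracted image paths with legacy/ and list paths lacking alt text."""
    missing_alt: list[str] = []

    def _replace(match: re.Match) -> str:
        alt = match.group("alt")
        new_path = "legacy/" + match.group("path")
        if not alt.strip():
            # Alt text is left untouched; a reviewer has to write the description.
            missing_alt.append(new_path)
        return f"![{alt}]({new_path})"

    return IMAGE_REF.sub(_replace, markdown), missing_alt
```

The second return value lists images that still need descriptive alt text; the sketch deliberately does not invent alt text, since that requires looking at the image.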

## Allowed formatting normalization
You may normalize formatting only when the meaning is unchanged and unambiguous:

@@ -49,6 +67,7 @@ You may normalize formatting only when the meaning is unchanged and unambiguous:
- Normalize bullet formatting and markdown table formatting.
- Normalize heading structure to match the repository template.
- For reaction mixes and anything tabular, place them inside a table as in template.
- For image-based tables, convert to Markdown tables wherever this is legible and unambiguous; otherwise retain the image at the correct protocol location.
- Normalize markdown headings, bullets, and tables.
- "Note" or "NOTE" or "NB" or "Optional" or "Recommended" or "Warning" are normalized to start with `>` (example `> **Note**`) and are placed immediately after the step they refer to, or at the end of the protocol if they clearly refer to the whole protocol.
- Remove empty columns from tables.
@@ -67,6 +86,7 @@ You may normalize formatting only when the meaning is unchanged and unambiguous:
- Do not replace one reagent name with another.
- Do not remove repeated warnings or notes.
- Do not omit unmapped text.
- Do not omit protocol-relevant images, image-based tables, figures, diagrams, or visual instructions.

## Output requirements
- edit `README.md`
@@ -86,6 +106,9 @@ You may normalize formatting only when the meaning is unchanged and unambiguous:
- template_version from `template-metadata.yml`
- ambiguous mappings
- normalized formatting changes
- extracted images converted to Markdown tables
- extracted images retained in `README.md`
- extracted images omitted because they were decorative or duplicated non-protocol content
- content copied verbatim but not confidently placed
- keep the template badge at the top
- keep ![Created with ulelab Protocol Template](https://img.shields.io/badge/created%20with-ulelab%20Protocol%20Template-blue) at the top of the file
@@ -97,7 +120,10 @@ After drafting, verify the migration against the source:
- compare the migrated `README.md` against `legacy/source.md`
- compare any malformed, incomplete, or ambiguous passages against `legacy/source.txt`
- compare the migrated `README.md` against the PDF in `legacy/` for tables, figures, layout-dependent content, and any remaining ambiguity
- compare every protocol-relevant image reference in `legacy/source.md` and every relevant file in `legacy/images/` against the migrated `README.md`
- check that all protocol steps, notes, warnings, reagent names, quantities, temperatures, timings, and conditions are still present
- check that image-based tables were converted accurately or retained as images with valid `legacy/images/...` paths
- check that no protocol-relevant figure, table image, diagram, gel/example image, or visual instruction was silently omitted
- check that no source content has been silently omitted, merged, or reordered without justification
- check any tables, layout-dependent content, or ambiguous sections against the PDF in `legacy/`
- leave `CHECK:` anywhere the mapping is uncertain rather than guessing
@@ -108,6 +134,8 @@ Verification checklist:
- no protocol steps or warnings were omitted
- no values were invented or made more precise than in the source
- tables and layout-dependent content were checked against the PDF in `legacy/`
- protocol-relevant extracted images were either converted to Markdown tables or retained at the correct location
- retained image links resolve from `README.md`
- any uncertain mappings are marked with `CHECK:`
- any meaningful normalization choices are noted in `# Migration notes`
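
The checklist item "retained image links resolve from `README.md`" can be verified with a short script. This is a sketch under stated assumptions: the helper name `broken_image_links` is hypothetical, `README.md` is assumed to sit at the repository root, and remote images (such as the template badge) are skipped rather than fetched.

```python
import re
from pathlib import Path

# Captures the target of every Markdown image reference: ![alt](target)
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def broken_image_links(readme_text: str, repo_root: Path) -> list[str]:
    """Return local image paths referenced in the README that do not exist on disk."""
    broken = []
    for target in IMAGE_LINK.findall(readme_text):
        if target.startswith(("http://", "https://")):
            continue  # remote images are not checked in this sketch
        if not (repo_root / target).is_file():
            broken.append(target)
    return broken
```

For example, `broken_image_links(Path("README.md").read_text(encoding="utf-8"), Path("."))` run from the repository root returns an empty list when every retained image link resolves.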

27 changes: 27 additions & 0 deletions .github/copilot-instructions.md
@@ -7,6 +7,7 @@ Do not change protocol meaning.
Use `legacy/source.md` as the primary source when rewriting `README.md`.
Use `legacy/source.txt` only as a fallback when `legacy/source.md` looks malformed, incomplete, or unclear.
Use the PDF file in `legacy/` as the final reference source of truth for tables, figures, layout-dependent content, and anything still ambiguous.
Treat image references in `legacy/source.md` and files in `legacy/images/` as extracted protocol content, not decorative artifacts, until they have been reviewed.
If `legacy/source.md` and `legacy/source.txt` disagree, prefer `legacy/source.md` for general structure and prose, but use the original PDF as the final tie-breaker.

## Migration behavior
@@ -26,6 +27,22 @@ When converting legacy protocol content into the repository template:
- Preserve the step order from the source unless the source clearly indicates otherwise.
- Preserve exact reagent and equipment names unless only formatting is changing.

## Images, figures, and image-based tables
PDF-to-Markdown conversion may extract protocol-relevant content as images, especially tables, thermocycler programs, reagent layouts, flow diagrams, gel/example images, or figure panels.

When migrating:
- Scan `legacy/source.md` for image references such as `![](images/...)`, and inspect `legacy/images/` for extracted images.
- Keep images that contain protocol content needed to perform or interpret the protocol.
- Omit only clearly decorative images such as logos, icons, page chrome, ornamental separators, or duplicated images that add no protocol content.
- Prefer converting image-based tables into Markdown tables when all headers, rows, values, units, grouping, and notes are legible and unambiguous.
- Preserve row grouping and cycle counts from thermocycler/program tables. If Markdown cannot represent the original grouping cleanly, add a short note or use repeated values rather than losing meaning.
- Do not OCR or transcribe illegible values by guesswork. If any value, header, grouping, or placement is uncertain, keep the image and add `CHECK:`.
- If an image contains non-tabular protocol content that cannot be safely converted to text, include the image in `README.md`.
- Place converted tables or retained images at the same logical location as the source image, near the step or section they support.
- In `README.md`, image paths must be valid from the repository root. Change extracted paths from `images/<file>` to `legacy/images/<file>` unless the image has deliberately been moved.
- Use descriptive alt text, for example `![Thermocycler program](legacy/images/page-3-image-2.png)`, not empty alt text.
- Mention in `# Migration notes` which extracted images were converted to Markdown tables, which were retained as images, and which were omitted as non-protocol/decorative.
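
To support the `# Migration notes` reporting rule, it may help to list which extracted images are still linked from `README.md` and which are not. The sketch below is illustrative only: `image_inventory` and its result keys are hypothetical names, and images are matched by file name alone. An image that is not linked still has to be classified by a person as converted to a table or omitted as decorative.

```python
import re
from pathlib import PurePosixPath

# Captures the target of every Markdown image reference: ![alt](target)
IMAGE_TARGET = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def image_inventory(source_md: str, readme_md: str) -> dict[str, list[str]]:
    """Split image file names found in source.md by whether README.md still links them."""
    extracted = {PurePosixPath(t).name for t in IMAGE_TARGET.findall(source_md)}
    linked = {PurePosixPath(t).name for t in IMAGE_TARGET.findall(readme_md)}
    return {
        "retained": sorted(extracted & linked),
        # Not linked: must be reported as converted to a table or omitted.
        "to_account_for": sorted(extracted - linked),
    }
```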

## Allowed formatting normalization
You may normalize formatting only when the meaning is unchanged and unambiguous:
- Add a space between numbers and units.
@@ -42,6 +59,7 @@ You may normalize formatting only when the meaning is unchanged and unambiguous:
- Normalize bullet formatting and markdown table formatting.
- Normalize heading structure to match the repository template.
- For reaction mixes and anything tabular, place them inside a table as in template.
- For image-based tables, convert to Markdown tables wherever this is legible and unambiguous; otherwise retain the image at the correct protocol location.
- Normalize markdown headings, bullets, and tables.
- "Note" or "NOTE" or "NB" or "Optional" or "Recommended" or "Warning" are normalized to start with `>` (example `> **Note**`) and are placed immediately after the step they refer to, or at the end of the protocol if they clearly refer to the whole protocol.
- Remove empty columns from tables.
@@ -60,6 +78,7 @@ You may normalize formatting only when the meaning is unchanged and unambiguous:
- Do not replace one reagent name with another.
- Do not remove repeated warnings or notes.
- Do not omit unmapped text.
- Do not omit protocol-relevant images, image-based tables, figures, diagrams, or visual instructions.

## Output requirements
When drafting a migrated protocol:
@@ -77,6 +96,9 @@ When drafting a migrated protocol:
- template_version from `template-metadata.yml`.
- ambiguous mappings.
- normalized formatting changes.
- extracted images converted to Markdown tables.
- extracted images retained in `README.md`.
- extracted images omitted because they were decorative or duplicated non-protocol content.
- content copied verbatim but not confidently placed.
- Keep ![Created with ulelab Protocol Template](https://img.shields.io/badge/created%20with-ulelab%20Protocol%20Template-blue) at the top of the file.
- Delete the "Template repository: Click `Use this template` to create a new protocol repo..." note.
@@ -88,7 +110,10 @@ After drafting, verify the migration against the source:
- compare the migrated `README.md` against `legacy/source.md`
- compare any malformed, incomplete, or ambiguous passages against `legacy/source.txt`
- compare the migrated `README.md` against the PDF in `legacy/` for tables, figures, layout-dependent content, and any remaining ambiguity
- compare every protocol-relevant image reference in `legacy/source.md` and every relevant file in `legacy/images/` against the migrated `README.md`
- check that all protocol steps, notes, warnings, reagent names, quantities, temperatures, timings, and conditions are still present
- check that image-based tables were converted accurately or retained as images with valid `legacy/images/...` paths
- check that no protocol-relevant figure, table image, diagram, gel/example image, or visual instruction was silently omitted
- check that no source content has been silently omitted, merged, or reordered without justification
- check any tables, layout-dependent content, or ambiguous sections against the PDF in `legacy/`
- leave `CHECK:` anywhere the mapping is uncertain rather than guessing
@@ -99,6 +124,8 @@ Verification checklist:
- no protocol steps or warnings were omitted
- no values were invented or made more precise than in the source
- tables and layout-dependent content were checked against the PDF in `legacy/`
- protocol-relevant extracted images were either converted to Markdown tables or retained at the correct location
- retained image links resolve from `README.md`
- any uncertain mappings are marked with `CHECK:`
- any meaningful normalization choices are noted in `# Migration notes`

60 changes: 60 additions & 0 deletions .github/workflows/fix-protocol-style.yml
@@ -0,0 +1,60 @@
name: fix-protocol-style

on:
  workflow_dispatch:
    inputs:
      base_branch:
        description: Branch to fix and open a PR against
        required: true
        default: main

permissions:
  contents: write
  pull-requests: write

concurrency:
  group: fix-protocol-style-${{ github.event.inputs.base_branch || 'main' }}
  cancel-in-progress: false

jobs:
  fix-style:
    runs-on: ubuntu-latest

    steps:
      - name: Check out target branch
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.base_branch }}

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Run unit tests
        run: |
          python -m unittest discover -s tests -p 'test_*.py'

      - name: Apply style fixer
        run: |
          python scripts/fix_protocol_style.py README.md

      - name: Verify README style after fixing
        run: |
          python scripts/validate_protocol_style.py README.md

      - name: Create pull request with style fixes
        uses: peter-evans/create-pull-request@v7
        with:
          base: ${{ github.event.inputs.base_branch }}
          branch: automation/fix-protocol-style/${{ github.event.inputs.base_branch }}
          delete-branch: true
          commit-message: Normalize protocol README style
          title: Normalize protocol README style
          body: |
            This PR was opened automatically by the style-fix workflow.

            It applies deterministic README style fixes from `scripts/fix_protocol_style.py` and re-validates the result with `scripts/validate_protocol_style.py`.
          add-paths: |
            README.md
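
The three `run` steps of this workflow can be reproduced locally before dispatching it. The following is a rough sketch that assumes it is run from the repository root and that the scripts accept the same arguments as in the workflow; the helper names `style_fix_commands` and `run_style_fix` are hypothetical and do not exist in the repository.

```python
import subprocess
import sys


def style_fix_commands(readme: str = "README.md") -> list[list[str]]:
    """Build the same commands the workflow runs: tests, then fixer, then validator."""
    return [
        [sys.executable, "-m", "unittest", "discover", "-s", "tests", "-p", "test_*.py"],
        [sys.executable, "scripts/fix_protocol_style.py", readme],
        [sys.executable, "scripts/validate_protocol_style.py", readme],
    ]


def run_style_fix(readme: str = "README.md") -> None:
    """Run each command in order, stopping at the first failure like a workflow job."""
    for command in style_fix_commands(readme):
        subprocess.run(command, check=True)  # raises CalledProcessError on non-zero exit
```

Calling `run_style_fix()` modifies `README.md` in place, so it is best run on a clean working tree where the resulting changes can be reviewed with `git diff`.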
6 changes: 6 additions & 0 deletions .github/workflows/validate-protocol.yml
@@ -6,23 +6,29 @@ on:
      - main
    paths:
      - README.md
      - scripts/fix_protocol_style.py
      - scripts/validate_protocol.py
      - scripts/validate_protocol_content.py
      - scripts/validate_protocol_style.py
      - tests/test_fix_protocol_style.py
      - tests/test_validate_protocol_content.py
      - tests/test_validate_protocol_style.py
      - .github/workflows/validate-protocol.yml
      - .github/workflows/fix-protocol-style.yml
  push:
    branches:
      - main
    paths:
      - README.md
      - scripts/fix_protocol_style.py
      - scripts/validate_protocol.py
      - scripts/validate_protocol_content.py
      - scripts/validate_protocol_style.py
      - tests/test_fix_protocol_style.py
      - tests/test_validate_protocol_content.py
      - tests/test_validate_protocol_style.py
      - .github/workflows/validate-protocol.yml
      - .github/workflows/fix-protocol-style.yml
  workflow_dispatch:

jobs: