
fix(markdown): stable round-trip for tables, captions, and audio #2720

Merged
nperez0111 merged 4 commits into main from feat/markdown-table-headers
May 7, 2026

Conversation

@nperez0111
Contributor

@nperez0111 nperez0111 commented May 7, 2026

Summary

Fixes #739.

Fixes several markdown round-trip issues so that exporting BlockNote blocks to markdown and parsing them back produces a stable result.

Rationale

  • Tables (Parsing markdown table is wrong #739): A headerless BlockNote table grew an empty row on every save → load cycle, because the empty header row required by markdown table syntax was being parsed back as a real first row.
  • Captions: Image and video captions were silently lost (or duplicated as a stray paragraph), and the caption prop ended up in name.
  • Audio: The audio block had no markdown representation — it was emitted as [](url) and parsed back as a plain link, losing the block entirely.
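The table problem described above can be reproduced with a toy model. The sketch below is not BlockNote's code: `ToyTable`, `exportToyTable`, and `parseToyTable` are hypothetical names, cells are plain strings without pipes, and the `blankHeaderMeansHeaderless` flag merely switches between the old and the fixed reading of a blank header line.

```typescript
// Toy model only: these helpers are hypothetical and are not BlockNote's code.
interface ToyTable {
  headerRows: number;
  rows: string[][];
}

const toLine = (cells: string[]): string => `| ${cells.join(" | ")} |`;

const toCells = (line: string): string[] =>
  line
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());

// Markdown tables need a header line, so a headerless table gets a blank one.
function exportToyTable(table: ToyTable): string {
  const first = table.rows[0] ?? [];
  const hasHeader = table.headerRows > 0;
  const header = hasHeader ? first : first.map(() => "");
  const body = hasHeader ? table.rows.slice(1) : table.rows;
  const separator = header.map(() => "---");
  return [toLine(header), toLine(separator), ...body.map(toLine)].join("\n");
}

// With the flag off, the blank header line is kept as a real first row.
// With the flag on, a blank header line is read as "no header row".
function parseToyTable(
  markdown: string,
  blankHeaderMeansHeaderless: boolean,
): ToyTable {
  const lines = markdown.split("\n");
  const header = toCells(lines[0] ?? "");
  const body = lines.slice(2).map(toCells); // lines[1] is the --- separator
  const headerIsBlank = header.every((cell) => cell === "");
  if (headerIsBlank && blankHeaderMeansHeaderless) {
    return { headerRows: 0, rows: body };
  }
  return { headerRows: 1, rows: [header, ...body] };
}
```

In this model, exporting a 2-row headerless table and parsing it with the flag off yields 3 rows, while parsing with the flag on returns the original table.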

Changes

  • markdownToHtml.ts: when every header cell is blank, emit <tbody>-only (no <thead>), so a markdown table with an empty header parses as a headerless table.
  • htmlToMarkdown.ts: image/video with a caption serialize to raw <figure><img|video><figcaption>...</figcaption></figure> HTML, so the caption survives the round-trip via BlockNote's existing figure parser. The descriptor (alt / data-name) is dropped when it would duplicate the caption text. Image/video without a caption keep the prettier ![alt](url) form.
  • htmlToMarkdown.ts: audio always serializes to raw <audio> HTML (with <figure> wrapping when captioned).
  • markdownToHtml.ts: video URLs in ![alt](url) route alt → data-name. audio added to the HTML block-tag set.
  • parseVideoElement.ts: read data-name into name, mirroring parseImageElement reading alt.
  • New round-trip snapshot tests: markdown/defaultBlocks (covers every default block type), markdown/tableWithHeaderRow, and a new image/urlOnly export case to lock in that figure-wrapping is only used when needed.
  • The existing markdown/table snapshot is regenerated — the old snapshot encoded the bug (3 rows + headerRows: 1 from a 2-row input).
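The first change in the list above (skip `<thead>` when every header cell is blank) can be sketched as follows. This is a minimal illustration under assumptions, not the actual `markdownToHtml.ts` code: `tableToHtml` is a hypothetical name, and cells are assumed to be already-rendered inline HTML.

```typescript
// Hypothetical sketch; cells are assumed to be already-rendered inline HTML.
function tableToHtml(headerCells: string[], bodyRows: string[][]): string {
  const renderRow = (cells: string[], tag: "th" | "td"): string =>
    `<tr>${cells.map((cell) => `<${tag}>${cell}</${tag}>`).join("")}</tr>`;

  // A header made only of blank cells is markdown's stand-in for "no header".
  const headerIsBlank = headerCells.every((cell) => cell.trim() === "");
  const thead = headerIsBlank
    ? ""
    : `<thead>${renderRow(headerCells, "th")}</thead>`;
  const tbody = `<tbody>${bodyRows
    .map((row) => renderRow(row, "td"))
    .join("")}</tbody>`;
  return `<table>${thead}${tbody}</table>`;
}
```

A header with at least one non-blank cell still produces a `<thead>`, so tables with real headers are unaffected in this sketch.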

Impact

Markdown export output changes for media blocks:

  • Captioned images/videos now serialize as raw <figure> HTML rather than ![alt](url) plus a caption paragraph.
  • Audio blocks serialize as <audio> HTML rather than [](url) link syntax.
  • Headerless tables no longer have a leading empty header row.

The pretty ![alt](url) markdown is preserved for the simple no-caption case.
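The export rules above amount to a small decision table. The sketch below models only that decision, not the string output. `MediaBlock`, `ExportPlan`, and `planMediaExport` are hypothetical names, and the descriptor is tracked only for images and videos because the description mentions alt / data-name only for those.

```typescript
// Hypothetical decision sketch; not the actual htmlToMarkdown.ts logic.
type MediaKind = "image" | "video" | "audio";

interface MediaBlock {
  kind: MediaKind;
  url: string;
  name?: string;
  caption?: string;
}

type ExportPlan =
  | { form: "shorthand"; alt: string } // ![alt](url)
  | { form: "rawTag" } // bare <audio> HTML
  | { form: "figure"; descriptor: string | null; caption: string };

function planMediaExport(block: MediaBlock): ExportPlan {
  const caption = (block.caption ?? "").trim();
  const name = (block.name ?? "").trim();

  if (caption !== "") {
    // Drop the descriptor when it would only repeat the caption text.
    const keepDescriptor =
      block.kind !== "audio" && name !== "" && name !== caption;
    return { form: "figure", descriptor: keepDescriptor ? name : null, caption };
  }
  if (block.kind === "audio") {
    return { form: "rawTag" };
  }
  return { form: "shorthand", alt: name };
}
```

Keeping the shorthand branch last makes the uncaptioned image/video case the default, which matches the stated goal of using figure wrapping only when needed.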

Testing

All existing tests pass plus the new snapshots. 651/651 conversion tests, 844/844 in tests/, 422 + 3 skipped in packages/core.

Checklist

  • Code follows the project's coding standards.
  • Unit tests covering the new feature have been added.
  • All existing tests pass.
  • The documentation has been updated to reflect the new feature

Summary by CodeRabbit

  • Bug Fixes

    • Improved audio/video HTML emission for reliable round-trip conversion.
    • Refined figure and caption handling to avoid duplicate descriptors and emit simpler markup when appropriate.
    • Added support for headerless tables during markdown↔HTML round trips.
    • Preserved video name metadata during parsing and conversion.
  • Tests

    • Expanded round-trip conversion tests including media-only images, tables, and comprehensive default-block scenarios.

- Headerless tables no longer grow an empty row each save (#739): emit no
  `<thead>` when every header cell is blank.
- Image/video captions now round-trip via raw `<figure>`+`<figcaption>` HTML
  so the `caption` prop is preserved (and not duplicated as a stray
  paragraph below the image). The descriptor (alt / data-name) is dropped
  when it would duplicate the caption text.
- Audio blocks serialize to raw `<audio>` HTML instead of `[](url)` link
  syntax, so they round-trip back into an audio block instead of being
  lost as a link.
- Markdown video parser routes alt text into `data-name` and
  `parseVideoElement` reads it back, mirroring image alt → name.
- `audio` added to HTML block-tag set so a standalone `<audio>` line is
  recognized as a block.
- Adds round-trip snapshot tests covering all default blocks
  (`markdown/defaultBlocks`), tables with header rows
  (`markdown/tableWithHeaderRow`), and the no-name/no-caption image case
  (`image/urlOnly`).
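To illustrate the block-tag point in the commit message above, here is a minimal sketch of how a parser might decide that a line opens a raw HTML block. The set contents (other than `audio`) and the function name `startsHtmlBlock` are assumptions for illustration. The source names an HTML block-tag set but does not list its members.

```typescript
// Illustrative set: only "audio" is confirmed by the commit message.
const HTML_BLOCK_TAGS = new Set(["figure", "video", "audio", "table", "div"]);

// Returns true when the line opens with a tag from the block-tag set.
function startsHtmlBlock(line: string): boolean {
  // Up to 3 leading spaces, then "<tagname" followed by space, "/", ">" or EOL.
  const match = /^ {0,3}<([a-zA-Z][a-zA-Z0-9-]*)(?=[\s/>]|$)/.exec(line);
  if (match === null) {
    return false;
  }
  return HTML_BLOCK_TAGS.has((match[1] ?? "").toLowerCase());
}
```

Without `audio` in the set, a standalone `<audio>` line would fall through to paragraph handling in this model, which is the failure mode the commit describes.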
@vercel

vercel Bot commented May 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
blocknote Ready Ready Preview May 7, 2026 4:48am
blocknote-website Ready Ready Preview May 7, 2026 4:48am

Request Review

@coderabbitai

coderabbitai Bot commented May 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a3b1e5ea-e40b-45bc-8b11-041122b97ac7

📥 Commits

Reviewing files that changed from the base of the PR and between 7512b1c and f92ca8e.

📒 Files selected for processing (1)
  • tests/src/unit/core/formatConversion/exportParseEquality/exportParseEqualityTestInstances.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/src/unit/core/formatConversion/exportParseEquality/exportParseEqualityTestInstances.ts

📝 Walkthrough


Audio now serializes as real <audio> HTML with escaped src; <figure> serialization was refactored to detect embedded img/video/audio and optional figcaption, emitting either markdown shorthand or full <figure> HTML while avoiding duplicate descriptors. Video parsing reads data-name. audio is treated as an HTML block on import and tables with empty headers omit <thead>. Tests for image-only export and markdown round-trip cases were added.
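As a rough illustration of the escaping mentioned in the walkthrough, the sketch below emits an `<audio>` tag with an escaped `src`, wrapped in `<figure>` when a caption is present. The helper names `escapeHtmlAttr` and `escapeHtmlText` come from the PR, but their bodies here are assumptions, as are `audioToHtml` and the exact attribute layout.

```typescript
// Assumed implementations; the PR names these helpers but their bodies
// are not shown in this conversation.
function escapeHtmlText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function escapeHtmlAttr(value: string): string {
  // Attribute values additionally need double quotes escaped.
  return escapeHtmlText(value).replace(/"/g, "&quot;");
}

// Hypothetical serializer: bare <audio> without a caption, <figure> with one.
function audioToHtml(url: string, caption?: string): string {
  const audio = `<audio src="${escapeHtmlAttr(url)}" controls></audio>`;
  if (caption === undefined || caption.trim() === "") {
    return audio;
  }
  return `<figure>${audio}<figcaption>${escapeHtmlText(caption)}</figcaption></figure>`;
}
```

Escaping `&` before `"` avoids double-escaping the entities that the quote replacement introduces.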

Changes

Media and Figure Serialization Round-Trip

Layer / File(s) Summary
Data Shape
packages/core/src/blocks/Video/parseVideoElement.ts
parseVideoElement now returns { url, previewWidth, name }, reading optional data-name.
Export Serialization
packages/core/src/api/exporters/markdown/htmlToMarkdown.ts
Audio blocks emit raw <audio ... controls> HTML with escaped src. serializeFigure rewritten to detect img/video/audio and figcaption, delegating to new serializeMediaFigure. New helpers escapeHtmlAttr and escapeHtmlText added. Duplicate descriptor emission suppressed when caption equals descriptor. Old simple serializeFigure removed.
Import Parsing / Wiring
packages/core/src/api/parsers/markdown/markdownToHtml.ts
parseImage/video HTML generation now prefers image alt for data-name (fallback to title). HTML_BLOCK_TAGS includes audio so <audio> is emitted as raw HTML blocks. emitTable omits <thead> when all header cells are empty to support headerless tables.
Tests
tests/src/unit/core/formatConversion/export/exportTestInstances.ts, tests/src/unit/core/formatConversion/exportParseEquality/exportParseEqualityTestInstances.ts
Added image/urlOnly export test. Added markdown/tableWithHeaderRow regression test and markdown/defaultBlocks round-trip snapshot covering many block types.
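The `parseVideoElement` change summarized above can be pictured with a DOM-free sketch. `AttributeSource` and `readVideoProps` are hypothetical names, and reading the URL from `src` and the width from a `width` attribute are assumptions. The source only confirms that the returned object has `url`, `previewWidth`, and `name`, with `name` read from `data-name`.

```typescript
// Minimal stand-in for the part of a DOM element this sketch needs.
interface AttributeSource {
  getAttribute(name: string): string | null;
}

interface VideoProps {
  url: string | undefined;
  previewWidth: number | undefined;
  name: string | undefined;
}

// Hypothetical reader, mirroring how an image parser would read alt into name.
function readVideoProps(element: AttributeSource): VideoProps {
  const widthAttr = element.getAttribute("width");
  const width = widthAttr ? Number(widthAttr) : NaN;
  return {
    url: element.getAttribute("src") ?? undefined,
    previewWidth: Number.isFinite(width) ? width : undefined,
    name: element.getAttribute("data-name") ?? undefined,
  };
}
```

Because the function depends only on `getAttribute`, it can be exercised with a plain object in tests as well as with a real `HTMLVideoElement`.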

Sequence Diagram(s)

sequenceDiagram
    participant Exporter
    participant Parser
    participant VideoBlock
    participant Tests

    Exporter->>Exporter: detect media blocks
    Exporter->>Exporter: serialize audio as raw HTML (escape attrs)
    Exporter->>Exporter: serialize figure or shorthand based on caption and media
    Exporter->>Tests: emit Markdown/HTML output

    Tests->>Parser: feed emitted Markdown/HTML
    Parser->>Parser: treat audio as raw HTML block
    Parser->>VideoBlock: parse data-name into video block object
    Parser->>Tests: return reconstructed blocks for assertions

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • TypeCellOS/BlockNote#2624: Modifies the same markdown↔HTML converters (htmlToMarkdown.ts and markdownToHtml.ts) and related media serialization changes.
  • TypeCellOS/BlockNote#2719: Also touches media/figure serialization and audio emission logic in htmlToMarkdown.ts.

Poem

A rabbit hops through markup bright, 🐇
Audio sings from escaped src light,
Figures cradle caption and name,
Tables trim headers without shame,
Round-trips hum and everything's right. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main changes: fixes for markdown round-trip stability covering tables, captions, and audio blocks.
Description check ✅ Passed The description covers summary, rationale, changes, impact, testing, and checklist sections with substantive detail; documentation update is unchecked but that is acceptable.
Linked Issues check ✅ Passed The PR addresses issue #739 by modifying markdownToHtml.ts to emit only <tbody> when header cells are blank, preventing empty headers from being parsed as real rows, achieving stable round-trips for headerless tables.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing markdown round-trip issues: table header handling, caption serialization for images/videos, audio block representation, and related test cases are all aligned with the stated objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@tests/src/unit/core/formatConversion/exportParseEquality/exportParseEqualityTestInstances.ts`:
- Around line 958-963: The test comment for the testCase named
"markdown/defaultBlocks" wrongly states that captions are dropped during
markdown round-trip; update the comment to remove "captions" from the list of
dropped features and instead note that captioned image/video/audio blocks now
survive the round-trip as raw <figure> HTML (or otherwise record that captions
are preserved as HTML). Edit the block comment above the testCase definition to
reflect this corrected behavior so the snapshot description matches current
functionality.
- Around line 715-740: The comment above the testCase is misleading: it states
"a table with no header row" but the testCase named
"markdown/tableWithHeaderRow" and the table content include headerRows: 1 and a
header row; update the comment to reflect that this is testing a table with a
header row (or alternatively change headerRows to 0 and remove header cells if
you intend a headerless test). Locate the testCase object (testCase.name
"markdown/tableWithHeaderRow", the table content with headerRows: 1) and either
correct the comment to describe a headered table round-trip or modify
headerRows/cells to create a truly headerless case to match the original issue.
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c2708b53-b872-47ef-a473-77f14590f68d

📥 Commits

Reviewing files that changed from the base of the PR and between 531ea32 and 9d81c2e.

⛔ Files ignored due to path filters (16)
  • tests/src/unit/core/formatConversion/export/__snapshots__/blocknoteHTML/image/urlOnly.html is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/html/image/urlOnly.html is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/audio/basic.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/audio/noName.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/image/basic.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/image/nested.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/image/noName.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/image/urlOnly.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/image/withCaption.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/markdown/video/withCaption.md is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/export/__snapshots__/nodes/image/urlOnly.json is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/exportParseEquality/__snapshots__/markdown/markdown/defaultBlocks.json is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/exportParseEquality/__snapshots__/markdown/markdown/table.json is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/exportParseEquality/__snapshots__/markdown/markdown/tableWithHeaderRow.json is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/exportParseEquality/__snapshots__/markdown/markdown/video.json is excluded by !**/__snapshots__/**
  • tests/src/unit/core/formatConversion/parse/__snapshots__/markdown/video.json is excluded by !**/__snapshots__/**
📒 Files selected for processing (5)
  • packages/core/src/api/exporters/markdown/htmlToMarkdown.ts
  • packages/core/src/api/parsers/markdown/markdownToHtml.ts
  • packages/core/src/blocks/Video/parseVideoElement.ts
  • tests/src/unit/core/formatConversion/export/exportTestInstances.ts
  • tests/src/unit/core/formatConversion/exportParseEquality/exportParseEqualityTestInstances.ts

The image-with-caption case now serializes to raw `<figure>` HTML and
parses back with `caption: "Caption"` preserved (previously the caption
was lost into a stray paragraph block).
@pkg-pr-new

pkg-pr-new Bot commented May 7, 2026


@blocknote/ariakit

npm i https://pkg.pr.new/@blocknote/ariakit@2720

@blocknote/code-block

npm i https://pkg.pr.new/@blocknote/code-block@2720

@blocknote/core

npm i https://pkg.pr.new/@blocknote/core@2720

@blocknote/mantine

npm i https://pkg.pr.new/@blocknote/mantine@2720

@blocknote/react

npm i https://pkg.pr.new/@blocknote/react@2720

@blocknote/server-util

npm i https://pkg.pr.new/@blocknote/server-util@2720

@blocknote/shadcn

npm i https://pkg.pr.new/@blocknote/shadcn@2720

@blocknote/xl-ai

npm i https://pkg.pr.new/@blocknote/xl-ai@2720

@blocknote/xl-docx-exporter

npm i https://pkg.pr.new/@blocknote/xl-docx-exporter@2720

@blocknote/xl-email-exporter

npm i https://pkg.pr.new/@blocknote/xl-email-exporter@2720

@blocknote/xl-multi-column

npm i https://pkg.pr.new/@blocknote/xl-multi-column@2720

@blocknote/xl-odt-exporter

npm i https://pkg.pr.new/@blocknote/xl-odt-exporter@2720

@blocknote/xl-pdf-exporter

npm i https://pkg.pr.new/@blocknote/xl-pdf-exporter@2720

commit: f92ca8e

@coderabbitai

coderabbitai Bot commented May 7, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

❌ Failed to clone repository into sandbox. Please try again.

…y/exportParseEqualityTestInstances.ts

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ival

Captions on image/video/audio now survive round-trip via raw `<figure>`
HTML — they shouldn't have been listed as a dropped feature.
@nperez0111 nperez0111 merged commit 1c720f2 into main May 7, 2026
23 checks passed
@nperez0111 nperez0111 deleted the feat/markdown-table-headers branch May 7, 2026 05:07


Development

Successfully merging this pull request may close these issues.

Parsing markdown table is wrong

1 participant