Skip to content

Always generate thumbnails for content nodes and topics#676

Draft
rtibbles wants to merge 1 commit into
learningequality:mainfrom
rtibbles:thumbnail_stage
Draft

Always generate thumbnails for content nodes and topics#676
rtibbles wants to merge 1 commit into
learningequality:mainfrom
rtibbles:thumbnail_stage

Conversation

@rtibbles

@rtibbles rtibbles commented Jun 7, 2026

Copy link
Copy Markdown
Member

Summary

Thumbnails are now always generated for content nodes and topics that don't provide one. Generation is cached and low-cost, so it is no longer opt-in.

Breaking change: the config.THUMBNAILS gate, --thumbnails CLI flag, generate-missing-thumbnails settings shim, and derive_thumbnail node kwarg are removed. Chefs passing derive_thumbnail will get a TypeError.

Topic tiles are generated in a sequential post-pass after tree processing, fixing a latent race where the thread-pool executor could process a topic before its children's thumbnails existed.

References

Related (not fixed): #490 — topics whose children are StudioContentNodes still get no tile, since their thumbnails exist only on Studio.

Reviewer guidance

Test coverage: tests/pipeline/test_thumbnails.py (new pipeline stage) and tests/test_thumbnails.py (node-level generation + deferred topic tiles). For an end-to-end check, run any chef with no flags — content nodes get generated thumbnails and topics get tiles; pass thumbnail=... on a node and confirm it wins over generation.

Areas deserving extra attention:

  • ricecooker/utils/pipeline/file_handler.py:136write_file no longer runs the empty-file check and storage copy when the body raises; this is framework-wide (all transfer/convert/thumbnail handlers). All 11 call sites were checked against the new semantics, but it has the widest blast radius in the PR.
  • ricecooker/managers/tree.py:114 — the deferred-thumbnail post-pass relies on all_nodes being children-first, runs sequentially after the concurrent executor, and registers tile files into file_map for upload.
  • ricecooker/classes/nodes.py:266has_thumbnail() is now preset-based plus an explicit self.thumbnail check; this changes thumbnail-detection semantics for legacy File classes too (classes/files.py:105).

AI usage

This PR was built with Claude Code: the design spec and implementation plan were drafted collaboratively and human-reviewed, implementation ran task-by-task from the plan (tests written first), and the diff went through three AI-assisted review rounds with fixes before commit. I reviewed the spec, plan, and diff at each checkpoint and verified the full test suite and lint locally.

🤖 Generated with Claude Code

Add a thumbnail extraction stage to the file pipeline (pdf, epub,
html5/kpub zips, mp4, webm) that emits a PNG FileMetadata alongside the
source file, skipped via the NODE_HAS_THUMBNAIL context key when the
node already provides a thumbnail. Thumbnail detection becomes
preset-based (File.is_thumbnail, Node.has_thumbnail) instead of
isinstance checks, so pipeline-generated thumbnails count.

Node-level generation is extracted into generate_missing_thumbnail(),
called from process_files for content nodes; topics defer to a new
sequential ChannelManager.generate_deferred_thumbnails() post-pass,
fixing the latent race where the concurrent executor could tile a topic
before its children finished.

Breaking change: the config.THUMBNAILS gate and derive_thumbnail node
kwarg are removed - generation is now always on for nodes without a
provided thumbnail. The --thumbnails CLI flag and the thumbnails /
generate-missing-thumbnails settings remain as deprecated no-ops that
emit a warning.

Also: write_file no longer masks in-flight exceptions or copies partial
files to storage; create_image_from_pdf_page gains max_width to cap
render size (downscale-only: pages that would render narrower than the
cap are not upscaled, with page width read via PyPDF2); guard
get_thumbnail_preset against kind-less nodes matching the
channel_thumbnail preset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant