Always generate thumbnails for content nodes and topics #676
Draft
rtibbles wants to merge 1 commit into
Add a thumbnail extraction stage to the file pipeline (pdf, epub, html5/kpub zips, mp4, webm) that emits a PNG FileMetadata alongside the source file, skipped via the NODE_HAS_THUMBNAIL context key when the node already provides a thumbnail.

Thumbnail detection becomes preset-based (File.is_thumbnail, Node.has_thumbnail) instead of isinstance checks, so pipeline-generated thumbnails count. Node-level generation is extracted into generate_missing_thumbnail(), called from process_files for content nodes; topics defer to a new sequential ChannelManager.generate_deferred_thumbnails() post-pass, fixing the latent race where the concurrent executor could tile a topic before its children finished.

Breaking change: the config.THUMBNAILS gate and derive_thumbnail node kwarg are removed - generation is now always on for nodes without a provided thumbnail. The --thumbnails CLI flag and the thumbnails / generate-missing-thumbnails settings remain as deprecated no-ops that emit a warning.

Also: write_file no longer masks in-flight exceptions or copies partial files to storage; create_image_from_pdf_page gains max_width to cap render size (downscale-only: pages that would render narrower than the cap are not upscaled, with page width read via PyPDF2); guard get_thumbnail_preset against kind-less nodes matching the channel_thumbnail preset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
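For reviewers, the write_file change described in the commit message can be pictured with a minimal sketch. This is not the code in file_handler.py: `write_file_sketch`, its parameters, and the empty-file error are hypothetical stand-ins, and the sketch only assumes that write_file is a context manager that validates and copies to storage after the body finishes.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def write_file_sketch(storage_dir, filename):
    """Hypothetical stand-in for write_file (not the ricecooker implementation)."""
    fd, tmp_path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            yield fh
        # Reached only when the body completed without raising.
        if os.path.getsize(tmp_path) == 0:
            raise ValueError("refusing to store empty file: " + filename)
        shutil.copyfile(tmp_path, os.path.join(storage_dir, filename))
    finally:
        # Cleanup runs on both paths; an in-flight exception propagates unchanged.
        os.remove(tmp_path)


storage = tempfile.mkdtemp()

# Failure path: the body raises midway, so nothing is copied to storage
# and the original exception (not an empty-file error) reaches the caller.
try:
    with write_file_sketch(storage, "partial.bin") as fh:
        fh.write(b"half a file")
        raise RuntimeError("download interrupted")
except RuntimeError as exc:
    caught = str(exc)
after_failure = os.listdir(storage)

# Success path: the body completes, so the file is validated and copied.
with write_file_sketch(storage, "ok.bin") as fh:
    fh.write(b"complete")
after_success = os.listdir(storage)
```

The point of the sketch is the ordering: the size check and copy sit after the `yield`, outside any `except`, so a failing body skips both and its own exception is the one the caller sees.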
Summary
Thumbnails are now always generated for content nodes and topics that don't provide one. Generation is cached and low-cost, so it is no longer opt-in.
Breaking change: the `config.THUMBNAILS` gate, `--thumbnails` CLI flag, `generate-missing-thumbnails` settings shim, and `derive_thumbnail` node kwarg are removed. Chefs passing `derive_thumbnail` will get a `TypeError`.

Topic tiles are generated in a sequential post-pass after tree processing, fixing a latent race where the thread-pool executor could process a topic before its children's thumbnails existed.
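To make the ordering argument concrete, here is a rough sketch of a children-first post-pass. It is not the `ChannelManager` code: `Node`, `children_first`, `process_content`, `generate_deferred`, the string "thumbnails", and the four-tile limit are all hypothetical simplifications, and a topic is modelled simply as any node with children.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical minimal node; real nodes carry files, presets, and kinds.
    name: str
    children: List["Node"] = field(default_factory=list)
    thumbnail: Optional[str] = None

    @property
    def is_topic(self):
        # Simplification for this sketch: any node with children is a topic.
        return bool(self.children)


def children_first(node):
    """Post-order walk: every node is yielded after all of its descendants."""
    for child in node.children:
        yield from children_first(child)
    yield node


def process_content(node):
    # Stand-in for concurrent file processing; topics are skipped here.
    if not node.is_topic and node.thumbnail is None:
        node.thumbnail = node.name + ".png"


def generate_deferred(all_nodes):
    # Sequential post-pass over a children-first list: when a topic is
    # reached, every child (including nested topics) already has its result.
    for node in all_nodes:
        if node.is_topic and node.thumbnail is None:
            tiles = [c.thumbnail for c in node.children if c.thumbnail]
            if tiles:
                # The four-tile limit is an assumption of this sketch.
                node.thumbnail = "tile(" + ",".join(tiles[:4]) + ")"


root = Node("root", [Node("unit", [Node("a"), Node("b")]), Node("c")])
all_nodes = list(children_first(root))

# Content nodes may finish in any order inside the executor.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(process_content, all_nodes))

# Topics are tiled only after the executor has drained.
generate_deferred(all_nodes)
```

Because the tiling loop runs after the `with` block exits, no topic can be tiled while a child is still in flight, and the post-order list means nested topics are tiled before their parents.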
References
Related (not fixed): #490. Topics whose children are `StudioContentNode`s still get no tile, since their thumbnails exist only on Studio.

Reviewer guidance
Test coverage: `tests/pipeline/test_thumbnails.py` (new pipeline stage) and `tests/test_thumbnails.py` (node-level generation + deferred topic tiles). For an end-to-end check, run any chef with no flags: content nodes get generated thumbnails and topics get tiles. Then pass `thumbnail=...` on a node and confirm it wins over generation.

Areas deserving extra attention:

- `ricecooker/utils/pipeline/file_handler.py:136`: `write_file` no longer runs the empty-file check and storage copy when the body raises; this is framework-wide (all transfer/convert/thumbnail handlers). All 11 call sites were checked against the new semantics, but it has the widest blast radius in the PR.
- `ricecooker/managers/tree.py:114`: the deferred-thumbnail post-pass relies on `all_nodes` being children-first, runs sequentially after the concurrent executor, and registers tile files into `file_map` for upload.
- `ricecooker/classes/nodes.py:266`: `has_thumbnail()` is now preset-based plus an explicit `self.thumbnail` check; this changes thumbnail-detection semantics for legacy `File` classes too (`classes/files.py:105`).

AI usage
This PR was built with Claude Code: the design spec and implementation plan were drafted collaboratively and human-reviewed, implementation ran task-by-task from the plan (tests written first), and the diff went through three AI-assisted review rounds with fixes before commit. I reviewed the spec, plan, and diff at each checkpoint and verified the full test suite and lint locally.
🤖 Generated with Claude Code