Add Box shared link download handler#675
Draft
rtibbles wants to merge 1 commit into
Adds a BoxHandler to the transfer stage that resolves Box shared link URLs via the Box API (client credentials grant) and downloads the underlying file. Credentials are supplied through the BOX_CLIENT_ID, BOX_CLIENT_SECRET, and BOX_ENTERPRISE_ID environment variables, and the SDK dependency is available as the 'box' extra. Happy paths (pdf, video, vtt, audio) are covered by VCR cassette tests recorded against the live Box API; the cassette config scrubs credentials, access tokens, and account identity from the recordings.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds a `BoxHandler` to the transfer stage that resolves Box shared-link URLs via the Box API (client credentials grant) and downloads the underlying file. Credentials are supplied through the `BOX_CLIENT_ID`, `BOX_CLIENT_SECRET`, and `BOX_ENTERPRISE_ID` environment variables, and the SDK dependency is available as the `box` extra.
Happy paths (pdf, video, vtt, audio) are covered by VCR cassette tests recorded against the live Box API.
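As an illustration of how the credential and optional-dependency checks described above might look, here is a minimal stdlib-only sketch. The environment variable names come from this description; the helper names `read_box_credentials` and `sdk_available` are hypothetical and are not taken from the PR, and the SDK's module name is deliberately left as a parameter because it is not stated here.

```python
import importlib.util
import os

# Variable names are taken from the PR description.
REQUIRED_ENV_VARS = ("BOX_CLIENT_ID", "BOX_CLIENT_SECRET", "BOX_ENTERPRISE_ID")


def read_box_credentials(environ=os.environ):
    """Return the Box credentials as a dict, or raise if any are missing.

    Hypothetical helper for illustration only.
    """
    missing = [name for name in REQUIRED_ENV_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing Box credentials: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_ENV_VARS}


def sdk_available(module_name):
    """Check whether an optional dependency can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None
```

Checking availability before importing is one way to keep an extra optional: a handler can decline Box URLs with a clear message instead of failing at import time when the SDK is absent.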
References
Box shared-link handling complements the existing `GoogleDriveHandler` for the other major cloud-storage source encountered in channel sources. Test files live in this Learning Equality Box folder.
Reviewer guidance
- `uv run --group test pytest tests/pipeline/test_transfer.py -k box`. To re-record against the live API, delete `tests/cassettes/test_box_*.yaml` and run with real `BOX_*` env vars.
- `box_vcr` in `tests/vcr_config.py` strips the auth header and credential POST params, and `_scrub_box_response` replaces access tokens and account identity (`created_by`/`modified_by`/`owned_by`) with `FILTERED`. Grepping the cassettes for tokens/credentials/identity should come up empty.
- The `no_requests_cache` fixture exists because `tests/media_utils/` modules install requests-cache globally at import time, which breaks VCR replay for `requests`-based clients. Without it, the Box tests fail when the full suite runs but pass in isolation.
- `BoxHandler.HANDLED_EXCEPTIONS` and the lazy client import in `ricecooker/utils/pipeline/transfer.py` keep the `box` extra optional. The no-SDK-installed path is worth checking.
AI usage
I used Claude Code to implement the handler, the VCR test/scrubbing setup, and to diagnose a test-isolation failure with the globally installed requests-cache. I recorded the cassettes myself, verified no credentials or account identity survived in them, and reviewed the full diff before pushing.
🤖 Generated with Claude Code
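For reviewers who want a feel for the response scrubbing mentioned under Reviewer guidance, the following is a rough sketch of the idea, not the PR's actual `_scrub_box_response`. It assumes a VCR-style response dict whose body lives at `response["body"]["string"]`, and it treats `access_token` as the token field; the identity field names and the `FILTERED` placeholder come from this description, while the function names `scrub` and `scrub_response` are hypothetical.

```python
import json

# Identity field names and the placeholder come from the PR description.
# "access_token" is assumed here as the token field of the OAuth2 token response.
SENSITIVE_KEYS = {"access_token", "created_by", "modified_by", "owned_by"}
PLACEHOLDER = "FILTERED"


def scrub(value):
    """Recursively replace sensitive fields in parsed JSON with a placeholder."""
    if isinstance(value, dict):
        return {
            key: PLACEHOLDER if key in SENSITIVE_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def scrub_response(response):
    """Scrub a VCR-style response dict in place; non-JSON bodies are left alone."""
    body = response.get("body", {}).get("string")
    if not body:
        return response
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes, e.g. downloaded file content.
        return response
    scrubbed = json.dumps(scrub(payload))
    # Keep the body type unchanged (it may be bytes or str).
    response["body"]["string"] = scrubbed.encode() if isinstance(body, bytes) else scrubbed
    return response
```

With vcrpy, a function of this shape could be supplied as the `before_record_response` hook, so that recordings are scrubbed before they are written to disk.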