
Add Box shared link download handler #675

Draft
rtibbles wants to merge 1 commit into learningequality:main from rtibbles:box_handler

rtibbles (Member) commented Jun 5, 2026
Summary

Adds a BoxHandler to the transfer stage that resolves Box shared-link URLs via the Box API (client credentials grant) and downloads the underlying file. Credentials are supplied through the BOX_CLIENT_ID, BOX_CLIENT_SECRET, and BOX_ENTERPRISE_ID environment variables, and the SDK dependency is available as the box extra.
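
As an illustration of the credential handling described above, reading the three environment variables might look roughly like the sketch below. The variable names come from this description; the helper name `load_box_credentials`, its return shape, and its error behavior are assumptions made for illustration and are not the code in this PR.

```python
import os

# The three variable names come from the PR description; everything else
# here (function name, return shape, error type) is hypothetical.
BOX_ENV_VARS = ("BOX_CLIENT_ID", "BOX_CLIENT_SECRET", "BOX_ENTERPRISE_ID")


def load_box_credentials(environ=None):
    """Return a dict of Box credentials, or raise if any are missing."""
    # Allow a mapping to be passed in so the helper can be tested
    # without touching the real process environment.
    environ = os.environ if environ is None else environ
    missing = [name for name in BOX_ENV_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing Box credentials: " + ", ".join(missing))
    return {name: environ[name] for name in BOX_ENV_VARS}
```

Failing early with the full list of missing variables is one possible design; a handler could equally decline the URL and let another handler try it.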

Happy paths (pdf, video, vtt, audio) are covered by VCR cassette tests recorded against the live Box API.

References

BoxHandler complements the existing GoogleDriveHandler by covering the other major cloud-storage provider encountered in channel sources. Test files live in this Learning Equality Box folder.

Reviewer guidance

  • All tests replay from committed cassettes with no credentials: uv run --group test pytest tests/pipeline/test_transfer.py -k box. To re-record against the live API, delete tests/cassettes/test_box_*.yaml and run with real BOX_* env vars.
  • Cassette hygiene is the main thing to double-check: box_vcr in tests/vcr_config.py strips the auth header and credential POST params, and _scrub_box_response replaces access tokens and account identity (created_by/modified_by/owned_by) with FILTERED. Grepping the cassettes for tokens/credentials/identity should come up empty.
  • The no_requests_cache fixture exists because tests/media_utils/ modules install requests-cache globally at import time, which breaks VCR replay for requests-based clients. Without the fixture, the Box tests fail when the full suite runs but pass in isolation.
  • BoxHandler.HANDLED_EXCEPTIONS and the lazy client import in ricecooker/utils/pipeline/transfer.py keep the box extra optional — worth checking the no-SDK-installed path.
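
For reviewers who want a mental model of the response scrubbing mentioned in the cassette-hygiene bullet, the idea can be sketched as a recursive walk over a JSON body. The field names (`created_by`, `modified_by`, `owned_by`) and the `FILTERED` placeholder are taken from this description, and `access_token` is assumed to be the token field; the function names and traversal below are hypothetical and are not the contents of tests/vcr_config.py.

```python
import json

# Field names follow the PR description; the helper names and the exact
# traversal are assumptions for illustration only.
TOKEN_KEYS = {"access_token"}
IDENTITY_KEYS = {"created_by", "modified_by", "owned_by"}
SENSITIVE_KEYS = TOKEN_KEYS | IDENTITY_KEYS
PLACEHOLDER = "FILTERED"


def scrub(value):
    """Recursively replace sensitive keys anywhere in a decoded JSON value."""
    if isinstance(value, dict):
        return {
            key: PLACEHOLDER if key in SENSITIVE_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def scrub_json_body(body: bytes) -> bytes:
    """Scrub a response body if it is JSON; leave other bodies untouched."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (for example a downloaded PDF or video): return unchanged.
        return body
    return json.dumps(scrub(payload)).encode("utf-8")
```

A function of this shape could be called from a VCR response hook; how the real `_scrub_box_response` is wired into `box_vcr` is best confirmed in the diff itself.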

AI usage

I used Claude Code to implement the handler, the VCR test/scrubbing setup, and to diagnose a test-isolation failure with the globally installed requests-cache. I recorded the cassettes myself, verified no credentials or account identity survived in them, and reviewed the full diff before pushing.

🤖 Generated with Claude Code

Adds a BoxHandler to the transfer stage that resolves Box shared link
URLs via the Box API (client credentials grant) and downloads the
underlying file. Credentials are supplied through the BOX_CLIENT_ID,
BOX_CLIENT_SECRET, and BOX_ENTERPRISE_ID environment variables, and the
SDK dependency is available as the 'box' extra.

Happy paths (pdf, video, vtt, audio) are covered by VCR cassette tests
recorded against the live Box API; the cassette config scrubs
credentials, access tokens, and account identity from the recordings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>