data: Llama 4 Scout BF16 shard 5 → bgz17 (18.2 GB → 7.7 MB, 4735×) · Pull Request #49 · AdaWorldAPI/ndarray

Streamed from HuggingFace via HTTP range reader. Zero disk for source. MoE expert FFN: 15,420× compression. Shared expert: 964×. Attention: 2,162×. Full model estimate: ~215 GB BF16 → ~40 MB bgz7. https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7

AdaWorldAPI · 2026-03-30T00:50:34Z

No description provided.

HttpRangeReader implements Read + Seek over HTTP via curl range requests. Enables streaming GGUF indexing from HuggingFace without disk copy. 8 MB chunked buffering, resolve_hf_url helper for HF metadata. Llama 4 Scout integration test streams IQ1_S (32.5 GB) directly from HF. https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
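
A minimal sketch of how a chunk-buffered range reader can satisfy Read + Seek. This is not the PR's HttpRangeReader: the `RangeReader` type and its `fetch(start, len)` callback are hypothetical, with the callback standing in for one HTTP range request (the PR issues those via curl), so the buffering and seek logic can be shown without a network dependency.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical range-backed reader. `fetch(start, len)` stands in for one
/// HTTP range request and must return the bytes at `start..start + len`.
pub struct RangeReader<F: FnMut(u64, usize) -> io::Result<Vec<u8>>> {
    fetch: F,
    total_len: u64,    // size of the remote object in bytes
    chunk_size: usize, // bytes requested per cache miss
    pos: u64,          // logical cursor into the remote object
    buf: Vec<u8>,      // the one chunk currently cached
    buf_start: u64,    // remote offset of buf[0]
}

impl<F: FnMut(u64, usize) -> io::Result<Vec<u8>>> RangeReader<F> {
    pub fn new(fetch: F, total_len: u64, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        Self { fetch, total_len, chunk_size, pos: 0, buf: Vec::new(), buf_start: 0 }
    }
}

impl<F: FnMut(u64, usize) -> io::Result<Vec<u8>>> Read for RangeReader<F> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if self.pos >= self.total_len || out.is_empty() {
            return Ok(0); // end of object, or nothing requested
        }
        let buf_end = self.buf_start + self.buf.len() as u64;
        if self.pos < self.buf_start || self.pos >= buf_end {
            // Cache miss: fetch one chunk starting at the cursor.
            let want = (self.total_len - self.pos).min(self.chunk_size as u64) as usize;
            self.buf = (self.fetch)(self.pos, want)?;
            self.buf_start = self.pos;
            if self.buf.is_empty() {
                return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "empty range response"));
            }
        }
        // Serve as much as possible from the cached chunk.
        let off = (self.pos - self.buf_start) as usize;
        let n = out.len().min(self.buf.len() - off);
        out[..n].copy_from_slice(&self.buf[off..off + n]);
        self.pos += n as u64;
        Ok(n)
    }
}

impl<F: FnMut(u64, usize) -> io::Result<Vec<u8>>> Seek for RangeReader<F> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Seeking only moves the cursor; the next read fetches if needed.
        let target = match from {
            SeekFrom::Start(o) => Some(o),
            SeekFrom::End(d) => self.total_len.checked_add_signed(d),
            SeekFrom::Current(d) => self.pos.checked_add_signed(d),
        };
        match target {
            Some(t) => {
                self.pos = t;
                Ok(t)
            }
            None => Err(io::Error::new(io::ErrorKind::InvalidInput, "seek out of range")),
        }
    }
}
```

Because seeks are lazy, a parser that jumps between tensor headers only pays for a request when it actually reads outside the cached chunk.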

Streams 18.2 GB BF16 shard directly from HuggingFace via HTTP range reader. Zero disk usage for source GGUF. Validates BF16 dequant path and MoE tensor handling on real Llama 4 weights. https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7

Replace scalar bf16_to_f32 loop with quantized::bf16_to_f32_slice batch path. Same BF16 repr (transparent u16), zero-copy reinterpret of raw bytes to BF16 slice, then batch convert to f32. https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
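
For context, a sketch of what a batch BF16 to f32 conversion does. The function names below are illustrative and are not the crate's quantized::bf16_to_f32_slice. Unlike the zero-copy reinterpret described above, this version decodes little-endian byte pairs explicitly, which avoids alignment and endianness assumptions at the cost of not being zero-copy on the input side.

```rust
/// Widen one BF16 bit pattern to f32. BF16 is the upper 16 bits of an
/// IEEE 754 binary32 value, so the conversion is a 16-bit left shift.
pub fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Convert a buffer of little-endian BF16 values to f32 in one pass.
/// Illustrative helper (assumed name); panics if the byte length is odd.
pub fn bf16_le_bytes_to_f32(raw: &[u8]) -> Vec<f32> {
    assert!(raw.len() % 2 == 0, "BF16 buffer must hold whole 2-byte values");
    raw.chunks_exact(2)
        .map(|pair| bf16_bits_to_f32(u16::from_le_bytes([pair[0], pair[1]])))
        .collect()
}
```

The conversion is exact: every BF16 value is representable as an f32, so no rounding occurs in this direction.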

Fewer HTTP round-trips: 18 GB shard = ~72 requests instead of ~1125. 256 MB fits comfortably in RAM alongside the dequantized tensor. https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
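
The request count follows from a ceiling division of shard size by chunk size. A small sketch, with `range_requests` as an illustrative helper name and sizes taken as binary units (18 GiB, 256 MiB), which is an assumption about how the figures above were rounded:

```rust
/// Number of range requests needed to read `total_bytes` in chunks of
/// `chunk_bytes` (ceiling division). Illustrative helper, not from the PR.
pub fn range_requests(total_bytes: u64, chunk_bytes: u64) -> u64 {
    assert!(chunk_bytes > 0, "chunk size must be non-zero");
    total_bytes / chunk_bytes + u64::from(total_bytes % chunk_bytes != 0)
}

pub const MIB: u64 = 1024 * 1024;
pub const GIB: u64 = 1024 * MIB;
```

With these definitions, `range_requests(18 * GIB, 256 * MIB)` evaluates to 72, matching the figure in the commit message.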


claude added 5 commits March 30, 2026 00:20

perf: 256 MB HTTP chunks for streaming GGUF indexing

cb8d7a3


AdaWorldAPI merged commit 6cdfa9b into master Mar 30, 2026
4 of 10 checks passed

AdaWorldAPI mentioned this pull request May 13, 2026

Doctest fixes (from #139) + unblock i686 cross-tests + thumbv6m nostd build #141

Merged
