modelopt

Here are 5 public repositories matching this topic...

AEON-7 / supergemma4-26b-abliterated-multimodal-nvfp4

NVFP4 AWQ Full quantization of SuperGemma4-26B-Abliterated-Multimodal for Blackwell GPUs — pre-built vLLM container + patches included

moe quantization multimodal blackwell awq llm vllm nvfp4 dgx-spark gemma4 modelopt

Updated May 1, 2026
Python

AImindPalace / dgx-spark-nvfp4-serving

Guide for serving fine-tuned Qwen3.5-27B (dense, NVFP4) on DGX Spark via native vLLM. Includes critical config fixes for modelopt export_hf_checkpoint() that prevent silent FP32 dequantization.

quantization fine-tuning llama-cpp vllm llm-inference qwen speculative-decoding nvfp4 dgx-spark modelopt

Updated Apr 21, 2026
Python

lna-lab / GGUF-to-NVFP4-SM120

Lna-Lab production pipeline: GGUF -> modelopt-format NVFP4 + working MTP head for vLLM on RTX PRO 6000 Blackwell (SM120). Stages 2 (NVFP4) and 3 (MTP graft) are Lna-Lab originals; stage 1 (GGUF->bf16) reuses li-yifei/gguf-to-nvfp4.

quantization mtp blackwell vllm gguf qwen3 sm120 nvfp4 modelopt

Updated Apr 27, 2026
Python

AEON-7 / modelopt-fast-moe

nvidia calibration moe quantization gemma mixture-of-experts awq llm nvfp4 modelopt

Updated May 1, 2026
Python

AEON-7 / Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4

EAGLE E4B speculative decoding drafter for Gemma 4 31B DECKARD HERETIC Uncensored NVFP4 — optimized for NVIDIA DGX Spark

eagle drafter blackwell awq vllm speculative-decoding nvfp4 dgx-spark gemma4 modelopt

Updated May 1, 2026
