dwnmf/gemma_3_4b_opus_distilled

Hugging Face
Vision · Concurrency cost: 1 · Model size: 4.3B · Quant: BF16 · Context length: 32k · Published: Mar 20, 2026 · Architecture: Transformer

The dwnmf/gemma_3_4b_opus_distilled model is a 4.3-billion-parameter language model, fine-tuned and converted to GGUF format by dwnmf using Unsloth. Based on the Gemma architecture, it supports a 32,768-token context length. The release is notable for its GGUF compatibility and ships configurations for both text-only and multimodal use, with a particular focus on vision model integration for platforms like Ollama.


Overview

dwnmf/gemma_3_4b_opus_distilled is a 4.3-billion-parameter model, fine-tuned and converted into the GGUF format by dwnmf using the Unsloth framework for faster training. It is designed for efficient local deployment with llama-cli for text-only tasks and llama-mtmd-cli for multimodal applications.
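As a sketch of the two invocation paths, the commands below use llama.cpp's `llama-cli` and `llama-mtmd-cli`; the GGUF filenames and the image path are assumptions for illustration, not names taken from this repository:

```shell
# Text-only inference (hypothetical filename for the BF16 GGUF)
llama-cli -m gemma_3_4b_opus_distilled-BF16.gguf \
  -c 32768 \
  -p "Summarize the Gemma 3 architecture in two sentences."

# Multimodal inference: the vision projector is passed separately
# via --mmproj (filenames assumed; see the repo's actual file list)
llama-mtmd-cli -m gemma_3_4b_opus_distilled-BF16.gguf \
  --mmproj BF16-mmproj.gguf \
  --image photo.png \
  -p "Describe this image."
```

Note that `llama-mtmd-cli` keeps the language model and the vision projector as separate files, which is why the Ollama workflow below differs.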

Key Capabilities

  • GGUF Compatibility: Provided in GGUF format, making it suitable for local inference with tools like llama-cli.
  • Multimodal Support: Includes configurations for multimodal use, specifically with BF16-mmproj.gguf for vision tasks.
  • Ollama Integration: Specific instructions are provided for creating unified BF16 models for use with Ollama, addressing its current lack of separate mmproj file support.
  • Optimized Training: Benefits from Unsloth's optimizations, enabling 2x faster training.
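For the Ollama path, the model card's conversion steps produce a single unified BF16 GGUF (since Ollama does not currently accept a separate mmproj file). A minimal sketch of registering such a unified file with Ollama follows; the filename and model name are assumptions, and the unified GGUF itself is assumed to have been produced per the repository's instructions:

```shell
# Minimal Modelfile pointing at the unified BF16 GGUF
# (filename is hypothetical)
cat > Modelfile <<'EOF'
FROM ./gemma_3_4b_opus_distilled-unified-BF16.gguf
EOF

# Register and run the model under a local name
ollama create gemma3-opus-distilled -f Modelfile
ollama run gemma3-opus-distilled "Describe this image." 
```

With a unified GGUF, `ollama run` can accept image inputs directly in the prompt session; with a text-only GGUF the same commands work but vision prompts are ignored.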

Good for

  • Developers seeking a Gemma-based model in GGUF format for local deployment.
  • Applications requiring a 4.3B-parameter model with a 32,768-token context for both text and vision tasks.
  • Users looking to integrate a vision-capable model with Ollama, following the provided conversion steps.