dwnmf/gemma_3_4b_opus_distilled
The dwnmf/gemma_3_4b_opus_distilled model is a 4.3-billion-parameter language model, fine-tuned and converted to GGUF format by dwnmf using Unsloth. Based on the Gemma architecture, it supports a 32768-token context length. It is notable for its GGUF compatibility and ships with configurations for both text-only and multimodal applications, with a particular focus on vision model integration for platforms like Ollama.
Overview
dwnmf/gemma_3_4b_opus_distilled is a 4.3-billion-parameter model, fine-tuned and converted to the GGUF format by dwnmf, leveraging the Unsloth framework for faster training. It is designed for efficient local deployment: llama-cli handles text-only tasks, while llama-mtmd-cli handles multimodal applications.
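As a rough sketch of local inference with llama.cpp's CLI tools (the GGUF filenames below are assumptions; check the repository's file listing for the actual names):

```shell
# Text-only inference with llama-cli
# (model filename is an assumption; substitute the actual GGUF from the repo)
llama-cli \
  -m gemma_3_4b_opus_distilled-BF16.gguf \
  -c 32768 \
  -p "Summarize the Gemma architecture in two sentences."

# Multimodal inference with llama-mtmd-cli, supplying the vision
# projector via --mmproj alongside the base model
llama-mtmd-cli \
  -m gemma_3_4b_opus_distilled-BF16.gguf \
  --mmproj BF16-mmproj.gguf \
  --image photo.jpg \
  -p "Describe this image."
```

The `-c 32768` flag requests the model's full advertised context length; lower it if memory is constrained.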
Key Capabilities
- GGUF Compatibility: Provided in GGUF format, making it suitable for local inference with tools like llama-cli.
- Multimodal Support: Includes configurations for multimodal use, specifically with BF16-mmproj.gguf for vision tasks.
- Ollama Integration: Specific instructions are provided for creating unified BF16 models for use with Ollama, addressing its current lack of separate mmproj file support.
- Optimized Training: Benefits from Unsloth's optimizations, enabling 2x faster training.
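Because Ollama does not currently accept a separate mmproj file, the conversion steps referenced above merge the projector into a single unified GGUF. A minimal Modelfile for such a merged model might look like the following (the unified GGUF filename and model tag are assumptions for illustration):

```
# Minimal Ollama Modelfile (filename is an assumption; use your merged GGUF)
# Build and run with:
#   ollama create gemma-opus-distilled -f Modelfile
#   ollama run gemma-opus-distilled
FROM ./gemma_3_4b_opus_distilled-unified-BF16.gguf
PARAMETER num_ctx 32768
```

The `num_ctx` parameter raises Ollama's default context window to match the model's 32768-token limit.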
Good for
- Developers seeking a Gemma-based model in GGUF format for local deployment.
- Applications requiring a 4.3B parameter model with a 32768 token context for both text and vision tasks.
- Users looking to integrate a vision-capable model with Ollama, following the provided conversion steps.