shemilk/gemma-3-12b-merged-m-e-h

  • Capabilities: Vision
  • Concurrency Cost: 1
  • Model Size: 12B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Feb 16, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

shemilk/gemma-3-12b-merged-m-e-h is a 12-billion-parameter instruction-tuned causal language model developed by shemilk. It is finetuned from unsloth/gemma-3-12b-it-unsloth-bnb-4bit using Unsloth and Hugging Face's TRL library, enabling up to 2x faster training. The model is designed for general language generation tasks, leveraging the Gemma 3 architecture and a 32,768-token context length.


Model Overview

As the summary above notes, shemilk/gemma-3-12b-merged-m-e-h is a 12-billion-parameter instruction-tuned model based on the Gemma 3 architecture and finetuned from the unsloth/gemma-3-12b-it-unsloth-bnb-4bit checkpoint. The distinguishing detail of its development is the training methodology: finetuning with Unsloth and Hugging Face's TRL library, which yielded roughly a 2x speedup over a standard finetuning setup.
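The card ships no usage code, so below is a minimal inference sketch assuming the merged checkpoint loads like a standard Gemma 3 instruction-tuned model through Hugging Face transformers. The prompt, dtype, and device settings are illustrative assumptions, not part of the original card.

```python
# Minimal inference sketch, assuming the merged weights load like a standard
# Gemma 3 instruction-tuned checkpoint. The prompt, dtype, and device settings
# below are illustrative assumptions, not part of the model card.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="shemilk/gemma-3-12b-merged-m-e-h",
    torch_dtype=torch.bfloat16,  # bf16 is a safe default if FP8 serving is unavailable
    device_map="auto",
)

# Chat-format input; the pipeline applies the model's Gemma 3 chat template.
messages = [
    {"role": "user", "content": "Explain what instruction tuning does in two sentences."},
]

result = generator(messages, max_new_tokens=128)
# The pipeline returns the whole conversation; the last turn is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```

Passing chat-format messages rather than a raw string lets transformers apply the checkpoint's own chat template, which matters for instruction-tuned models like this one.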

Key Capabilities

  • Instruction Following: Designed to respond effectively to given instructions due to its instruction-tuned nature.
  • Efficient Training: Benefits from the Unsloth framework, which optimizes the finetuning process for speed (see the training sketch after this list).
  • Gemma 3 Architecture: Leverages the underlying capabilities of the Gemma 3 model family.
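
The actual training recipe is not published on the card. For orientation, here is a hedged sketch of how a Gemma 3 finetune with Unsloth and TRL's SFTTrainer is typically assembled; the dataset choice, LoRA settings, and hyperparameters are assumptions for illustration only.

```python
# Hedged sketch of an Unsloth + TRL finetuning run of the kind described above.
# Dataset, LoRA settings, and hyperparameters are illustrative assumptions; the
# author's actual training recipe is not published on the card.
from unsloth import FastLanguageModel  # import unsloth first so its patches apply

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the same 4-bit base checkpoint the card names, with Unsloth's fast kernels.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    max_seq_length=32768,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Any instruction dataset works; map it to a single "text" field for SFTTrainer.
dataset = load_dataset("yahma/alpaca-cleaned", split="train")  # hypothetical choice

def to_text(example):
    return {
        "text": f"### Instruction:\n{example['instruction']}\n\n"
                f"### Response:\n{example['output']}"
    }

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=100,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()

# After training, merged 16-bit weights can be exported for release, e.g.:
# model.save_pretrained_merged("gemma-3-12b-merged", tokenizer)
```

A merge step like the commented export at the end is presumably how the "merged" weights in this repository were produced, though the card does not say so explicitly.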

Good For

  • Applications that need a 12B-parameter model produced through an efficient finetuning pipeline.
  • General text generation and instruction-based tasks.
  • Developers interested in models trained with Unsloth for faster iteration.