zelk12/26_05_2025_Test_LazyMergekit_gemma-3-12B

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: May 26, 2025 · License: gemma · Architecture: Transformer

The zelk12/26_05_2025_Test_LazyMergekit_gemma-3-12B is a 12 billion parameter language model created by zelk12, formed by merging zelk12/MT-Gen1-gemma-3-12B and zelk12/MT-gemma-3-12B using the DARE TIES method. This model leverages the Gemma-3 architecture and has a context length of 32768 tokens. It is designed for general text generation tasks, combining characteristics from its merged components.


Overview

zelk12/26_05_2025_Test_LazyMergekit_gemma-3-12B is a 12 billion parameter language model developed by zelk12. This model is a product of a merge operation using LazyMergekit, combining the strengths of zelk12/MT-Gen1-gemma-3-12B and zelk12/MT-gemma-3-12B. The merge was performed using the dare_ties method with specific density and weight parameters for the contributing models, and normalization enabled.
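A dare_ties merge of this kind is typically described by a mergekit YAML config along the following lines. This is an illustrative sketch only: the density and weight values are placeholders, and the choice of base model is an assumption, since the card does not state the actual parameters used.

```yaml
# Illustrative mergekit config for a dare_ties merge of the two source models.
# Density/weight values are placeholders, not the values actually used.
models:
  - model: zelk12/MT-Gen1-gemma-3-12B
    parameters:
      density: 0.5   # fraction of delta weights retained (DARE drop rate complement)
      weight: 0.5    # contribution of this model to the merge
  - model: zelk12/MT-gemma-3-12B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: zelk12/MT-gemma-3-12B   # assumed base; the card does not specify
parameters:
  normalize: true   # normalization enabled, as stated above
dtype: bfloat16
```

In DARE TIES, each contributing model's weight deltas relative to the base are randomly dropped according to `density` and rescaled, then sign-conflicts between the models are resolved TIES-style before the weighted sum is applied.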

Key Capabilities

  • Merged Architecture: Integrates features from two distinct Gemma-3-12B based models, potentially offering a broader range of capabilities than its individual components.
  • General Text Generation: Suitable for various text generation tasks, leveraging the underlying Gemma-3 architecture.
  • Extended Context Window: Supports a context length of 32768 tokens, allowing for processing and generating longer sequences of text.

Usage

This model can be integrated into Python projects with the Hugging Face transformers library: load the tokenizer, apply its chat template to format the conversation, and generate text through a pipeline.
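The steps above can be sketched as follows. This is a minimal example assuming the model is published under its card ID on the Hugging Face Hub; generation settings such as `max_new_tokens` and `temperature` are illustrative defaults, not recommendations from the model author.

```python
# Minimal sketch of text generation with transformers.
# Assumes the model ID from the card is available on the Hugging Face Hub.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "zelk12/26_05_2025_Test_LazyMergekit_gemma-3-12B"

# Load the tokenizer and format a conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "Explain model merging in one sentence."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Build a generation pipeline and sample a completion.
generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
outputs = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```

Because the merged model inherits the Gemma-3 chat template from its parents, `apply_chat_template` handles the role and turn markers for you; building prompts by hand is unnecessary and error-prone.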