Mojo7/Katkut-3B

Text Generation · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Feb 13, 2026 · Architecture: Transformer

Mojo7/Katkut-3B is a merged language model created by Mojo7 using the SLERP method, combining Mojo7/Katkut-3B and Qwen/Qwen2.5-3B-Instruct. This 3-billion-parameter model draws on the strengths of both base models, aiming for balanced performance. It is designed for general language tasks, benefiting from the logical reasoning of Qwen and the specific stylistic characteristics of Katkut.


Overview

Mojo7/Katkut-3B is a merged language model developed by Mojo7, created using the SLERP (Spherical Linear Interpolation) merge method. This model combines the characteristics of two distinct base models: Mojo7/Katkut-3B and Qwen/Qwen2.5-3B-Instruct.

Merge Details

The merge combined layers from both models over the range [0, 28]. The base_model for the merge was Qwen/Qwen2.5-3B-Instruct, indicating its foundational role in the resulting architecture. Distinct values of the interpolation parameter t were applied to the self_attn and mlp layers, allowing the contribution of each base model to be balanced per component.
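
For context, SLERP interpolates between two weight vectors along the arc of the sphere containing them rather than along a straight line. The following is the standard definition of spherical linear interpolation, not a formula taken from the model card itself:

$$
\mathrm{slerp}(w_1, w_2; t) = \frac{\sin\big((1-t)\,\Omega\big)}{\sin \Omega}\, w_1 + \frac{\sin(t\,\Omega)}{\sin \Omega}\, w_2,
\qquad
\Omega = \arccos\!\left(\frac{w_1 \cdot w_2}{\lVert w_1 \rVert\, \lVert w_2 \rVert}\right),
$$

where $t \in [0, 1]$ controls the blend: $t = 0$ returns the first model's weights unchanged and $t = 1$ returns the second's.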

Key Characteristics

  • Hybrid Architecture: Benefits from the combined strengths of Mojo7/Katkut-3B and Qwen/Qwen2.5-3B-Instruct.
  • SLERP Method: Utilizes spherical linear interpolation to blend model weights smoothly rather than averaging them linearly (see the sketch after this list).
  • Parameter Blending: Specific weighting applied to attention and MLP layers to optimize performance.
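
As a rough illustration of the blending step, here is a minimal SLERP sketch in Python. It is a generic implementation of spherical linear interpolation over weight tensors, not the actual merge code used to produce this model, and the per-component t values shown are hypothetical:

```python
import numpy as np

def slerp(w1: np.ndarray, w2: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    v1, v2 = w1.ravel(), w2.ravel()
    # Angle between the two weight vectors.
    cos_omega = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.sin(omega) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1.0 - t) * w1 + t * w2
    blended = (np.sin((1.0 - t) * omega) * v1 + np.sin(t * omega) * v2) / np.sin(omega)
    return blended.reshape(w1.shape)

# Demo on random tensors, mirroring the card's description of different
# t values for self_attn and mlp layers (the 0.3 / 0.7 values are made up).
rng = np.random.default_rng(0)
a, b = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
merged_attn = slerp(a, b, t=0.3)
merged_mlp = slerp(a, b, t=0.7)
```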

Potential Use Cases

This merged model is suitable for general-purpose language generation and understanding tasks where a balance of reasoning and specific stylistic elements from its constituent models is desired. It can be particularly useful for applications requiring a blend of logical coherence and unique linguistic patterns.
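
Assuming the repository ships standard Hugging Face weights with a chat template inherited from Qwen2.5-3B-Instruct (worth verifying in the repo files), loading and prompting the model should follow the usual transformers pattern. This is a sketch, not code taken from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mojo7/Katkut-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# BF16 matches the quant listed in the card's metadata.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Explain what a SLERP model merge does in two sentences."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```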