Overview
Mojo7/Katkut-3B is a merged language model developed by Mojo7 using the SLERP (Spherical Linear Interpolation) merge method. It combines two base models: Mojo7/Katkut-3B and Qwen/Qwen2.5-3B-Instruct.
Merge Details
The merge combines layers from both models over the layer range [0, 28]. Qwen/Qwen2.5-3B-Instruct served as the base_model, anchoring the resulting architecture. The interpolation factor t was set separately for the self_attn and mlp layers, so the attention and feed-forward weights each draw on the two parents in different proportions; a sketch of the interpolation follows.
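For context, SLERP interpolates between corresponding weight tensors along the arc between them rather than along a straight line. Below is a minimal PyTorch sketch of that interpolation; it is illustrative only, not the mergekit implementation, and the per-component t values at the bottom are hypothetical placeholders, since this card does not publish the actual settings.

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Treats each tensor as a flat vector and interpolates along the arc:
        slerp(t) = sin((1-t)*omega)/sin(omega) * w0 + sin(t*omega)/sin(omega) * w1,
    where omega is the angle between the two vectors.
    """
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    # Angle between the two weight vectors.
    cos_omega = torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * v0 + t * v1
    else:
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * v0 \
               + (torch.sin(t * omega) / sin_omega) * v1
    return merged.reshape(w0.shape).to(w0.dtype)

# Hypothetical per-component t values (not the published settings):
T_SELF_ATTN = 0.3  # attention weights lean toward the first parent
T_MLP = 0.7        # MLP weights lean toward the second parent
```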
Key Characteristics
- Hybrid Architecture: Draws on the combined strengths of Mojo7/Katkut-3B and Qwen/Qwen2.5-3B-Instruct in a single set of weights.
- SLERP Method: Interpolates corresponding weight tensors along the arc between them rather than a straight line, which tends to preserve their scale better than plain averaging.
- Parameter Blending: Separate t values for the attention (self_attn) and MLP layers control how much each parent contributes per component; see the configuration sketch after this list.
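To make the per-component blending concrete, here is a hypothetical mergekit-style configuration, expressed as a Python dict for illustration (mergekit itself consumes the equivalent YAML). The structure mirrors the details in this card: slerp as the merge method, layer range [0, 28], Qwen/Qwen2.5-3B-Instruct as base_model, and t filters for self_attn and mlp. The numeric t values and the dtype are placeholders, not published settings.

```python
# Hypothetical merge configuration mirroring the details above.
# t values and dtype are placeholders, not the published settings.
merge_config = {
    "merge_method": "slerp",
    "base_model": "Qwen/Qwen2.5-3B-Instruct",
    "slices": [
        {
            "sources": [
                {"model": "Mojo7/Katkut-3B", "layer_range": [0, 28]},
                {"model": "Qwen/Qwen2.5-3B-Instruct", "layer_range": [0, 28]},
            ],
        }
    ],
    "parameters": {
        "t": [
            {"filter": "self_attn", "value": 0.3},  # placeholder
            {"filter": "mlp", "value": 0.7},        # placeholder
            {"value": 0.5},                         # default for all other tensors
        ],
    },
    "dtype": "bfloat16",  # assumed; not stated in this card
}
```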
Potential Use Cases
This merged model is suited to general-purpose language generation and understanding tasks where a balance between the reasoning ability and the stylistic traits of its constituent models is desired, particularly applications that call for both logical coherence and the distinctive linguistic patterns of its parents. A minimal loading example follows.
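A minimal usage sketch with the standard transformers API, assuming the model is published on the Hugging Face Hub under Mojo7/Katkut-3B and follows the usual Qwen2.5 chat template; the prompt is just an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the model is available on the Hugging Face Hub under this ID.
model_id = "Mojo7/Katkut-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize SLERP model merging in one sentence."}
]
# Build the chat-formatted prompt and generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```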