PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP
PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP is a 0.8 billion parameter language model based on the Qwen3 architecture, created by PARTAGES-dev through a SLERP merge. This model integrates a fine-tuned Qwen3-0.6B-Base-PARTAGES-v2-2160 with the original Qwen3-0.6B-Base, and supports a 32,768-token context length. It is designed for general language tasks, benefiting from the combined strengths of its merged components.
Model Overview
PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP is a 0.8 billion parameter language model derived from the Qwen3 architecture. This model was created using the SLERP (Spherical Linear Interpolation) merge method, combining two distinct Qwen3-0.6B-Base variants.
Merge Details
The model integrates:
- A specialized fine-tuned version: /home/mrim/manniona/partages/models/share/Qwen3-0.6B-Base-PARTAGES-v2-2160
- The foundational model: /home/mrim/manniona/partages/models/hf-dl/Qwen/Qwen3-0.6B-Base
The merge was performed with mergekit using a YAML configuration, applying an interpolation factor t = 0.5 for the SLERP interpolation across all 28 layers of both models. The resulting model retains the substantial 32,768-token context length of its parents.
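The original YAML is not reproduced here, but a mergekit configuration consistent with the details above would look roughly like the following sketch (the base-model choice and the bfloat16 dtype are assumptions, not published facts):

```yaml
# Hypothetical mergekit SLERP config matching the stated details
slices:
  - sources:
      - model: /home/mrim/manniona/partages/models/share/Qwen3-0.6B-Base-PARTAGES-v2-2160
        layer_range: [0, 28]
      - model: /home/mrim/manniona/partages/models/hf-dl/Qwen/Qwen3-0.6B-Base
        layer_range: [0, 28]
merge_method: slerp
base_model: /home/mrim/manniona/partages/models/hf-dl/Qwen/Qwen3-0.6B-Base  # assumed
parameters:
  t: 0.5          # uniform interpolation factor across all layers
dtype: bfloat16   # assumed
```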
Key Characteristics
- Architecture: Qwen3-based, 0.8 billion parameters.
- Merge Method: SLERP, combining two Qwen3-0.6B-Base models.
- Context Length: Supports a 32768 token context window.
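To make the merge method concrete, here is a minimal, self-contained sketch of SLERP on two flat weight vectors, the same interpolation mergekit applies per tensor (this is an illustration of the math, not mergekit's actual implementation):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    At t = 0 returns v0, at t = 1 returns v1; intermediate t values
    interpolate along the arc between their directions.
    """
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    # Nearly colinear vectors: fall back to plain linear interpolation
    if 1.0 - abs(dot) < eps:
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)  # angle between the two vectors
    s0 = np.sin((1.0 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1

# With t = 0.5 (as in this merge), the result lies halfway along the arc
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)
print(mid)  # → [0.70710678 0.70710678]
```

With t = 0.5, each layer's parameters end up equidistant (on the sphere) from the PARTAGES-v2-2160 variant and the original base model.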
Potential Use Cases
This model is suitable for applications requiring a compact yet capable language model, potentially benefiting from the specific adaptations present in the PARTAGES-v2-2160 component while retaining the robust base capabilities of Qwen3-0.6B.