PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP

Hosted on Hugging Face · Text generation · Model size: 0.8B · Quantization: BF16 · Context length: 32k · Published: Dec 4, 2025 · Architecture: Transformer

PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP is a 0.8 billion parameter language model based on the Qwen3 architecture, created by PARTAGES-dev through a SLERP merge. It combines a fine-tuned Qwen3-0.6B-Base-PARTAGES-v2-2160 with the original Qwen3-0.6B-Base and supports a 32768 token context length. The model is designed for general language tasks, benefiting from the combined strengths of its merged components.


Model Overview

PARTAGES-dev/Qwen3-0.6B-PDAPT-SLERP is a 0.8 billion parameter language model derived from the Qwen3 architecture. It was created with the SLERP (Spherical Linear Interpolation) merge method, combining two distinct Qwen3-0.6B-Base variants.

Merge Details

The model integrates:

  • A specialized version: /home/mrim/manniona/partages/models/share/Qwen3-0.6B-Base-PARTAGES-v2-2160
  • The foundational model: /home/mrim/manniona/partages/models/hf-dl/Qwen/Qwen3-0.6B-Base

The merge was performed with mergekit using a YAML configuration that applies a t value of 0.5 for the SLERP interpolation across all 28 layers of both models. The resulting model retains the full 32768 token context length of its parents.
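The exact YAML file is not reproduced here, but a mergekit SLERP configuration of this shape would typically look like the following sketch; the slice layout and dtype are assumptions, and only t: 0.5 over the 28 layers is stated above:

```yaml
# Hypothetical reconstruction of the mergekit config; values other than
# t: 0.5 and the 0-28 layer range are illustrative assumptions.
slices:
  - sources:
      - model: Qwen3-0.6B-Base-PARTAGES-v2-2160
        layer_range: [0, 28]
      - model: Qwen/Qwen3-0.6B-Base
        layer_range: [0, 28]
merge_method: slerp
base_model: Qwen/Qwen3-0.6B-Base
parameters:
  t: 0.5            # equal weight to both parents at every layer
dtype: bfloat16
```

With t fixed at 0.5, every merged layer sits at the spherical midpoint between the fine-tuned and the base weights.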

Key Characteristics

  • Architecture: Qwen3-based, 0.8 billion parameters.
  • Merge Method: SLERP, combining two Qwen3-0.6B-Base models.
  • Context Length: Supports a 32768 token context window.
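To make the merge method concrete, here is a minimal NumPy sketch of SLERP as applied to a pair of flattened weight tensors. This is an illustrative implementation of the general formula, not the mergekit code itself:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; t=0.5 (as used in this merge)
    yields the spherical midpoint.
    """
    # Normalize copies to measure the angle between the two vectors.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    # Standard SLERP: weights follow the arc, not the chord.
    return (np.sin((1 - t) * theta) / sin_theta) * v0 + \
           (np.sin(t * theta) / sin_theta) * v1
```

Unlike a plain average, SLERP preserves the geometry of the interpolation path, which is why it is a popular choice for merging checkpoints that share an architecture.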

Potential Use Cases

This model is suitable for applications requiring a compact yet capable language model, potentially benefiting from the specific adaptations present in the PARTAGES-v2-2160 component while retaining the robust base capabilities of Qwen3-0.6B.