Model Overview
PARTAGES-dev/Qwen3-1.7B-PDAPT-SLERP is a 2-billion-parameter language model built on the Qwen3-1.7B-Base architecture. It was created with the SLERP (Spherical Linear Interpolation) merge method, which blends the weights of two pre-trained models by interpolating along the arc between them rather than along a straight line, which tends to preserve the geometry (and hence the behavior) of each source model better than plain averaging.
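A minimal sketch of the interpolation itself, applied per weight tensor (this illustrates the math, not the merge tooling's actual implementation; dtype handling and per-layer filters are omitted):

```python
import numpy as np

def slerp(v0: np.ndarray, v1: np.ndarray, t: float = 0.5, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two weight tensors of the same shape.

    t=0 returns v0, t=1 returns v1, t=0.5 is the equal-weight midpoint.
    """
    a = v0.ravel().astype(np.float64)
    b = v1.ravel().astype(np.float64)
    # Angle between the two flattened weight vectors.
    cos_theta = np.clip(
        np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)), -1.0, 1.0
    )
    theta = np.arccos(cos_theta)
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        merged = (np.sin((1.0 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)
    return merged.reshape(v0.shape).astype(v0.dtype)
```

For two orthogonal unit vectors, the t=0.5 midpoint lies on the arc between them, at equal angle to both, rather than at the (shorter) chord midpoint a linear average would give.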
Merge Details
This model is a composite of two distinct base models:
- Qwen/Qwen3-1.7B-Base: A foundational model from the Qwen series.
- An undisclosed model: the identity of the second source model is not stated in the published configuration.
The merge combined the weights across all 28 layers of both source models, with the interpolation parameter t set to 0.5, giving the two models equal weight. The merged weights are stored in bfloat16.
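Merges of this shape are commonly described by a mergekit-style YAML configuration. The fragment below is a plausible reconstruction from the details above, not the published config; the second model's name is unknown and is left as a placeholder.

```yaml
# Hypothetical mergekit config consistent with the stated merge details.
slices:
  - sources:
      - model: Qwen/Qwen3-1.7B-Base
        layer_range: [0, 28]
      - model: <undisclosed-second-model>  # placeholder: not publicly specified
        layer_range: [0, 28]
merge_method: slerp
base_model: Qwen/Qwen3-1.7B-Base
parameters:
  t: 0.5          # equal weighting of both models
dtype: bfloat16
```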
Potential Use Cases
Given its foundation on the Qwen3-1.7B-Base and the nature of model merging, this model could be suitable for:
- General text generation: Creating coherent and contextually relevant text.
- Language understanding tasks: Processing and interpreting natural language inputs.
- Further fine-tuning: Serving as a robust base for domain-specific adaptations or instruction tuning, leveraging the combined knowledge of its merged components.
- Applications requiring longer contexts: Its 32,768-token context window allows it to handle long documents or extended conversational histories.
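The context window bounds the prompt and the generated output together, so applications typically clamp the generation length to whatever budget the prompt leaves. A small illustrative helper (the function name is ours, not part of any API):

```python
MAX_CONTEXT = 32768  # Qwen3-1.7B context window, in tokens

def generation_budget(prompt_tokens: int, requested_new_tokens: int) -> int:
    """Clamp the number of new tokens so prompt + output fits the window."""
    if prompt_tokens >= MAX_CONTEXT:
        raise ValueError("prompt alone exceeds the context window")
    return min(requested_new_tokens, MAX_CONTEXT - prompt_tokens)
```

For example, a 32,000-token prompt leaves room for at most 768 new tokens, however many were requested.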