Alelcv27/Qwen2.5-3B-Arcee-Base-INST
Alelcv27/Qwen2.5-3B-Arcee-Base-INST is a 3.1-billion-parameter language model based on the Qwen2.5-3B architecture, created by Alelcv27 using the Arcee Fusion merge method. The merge combines Qwen/Qwen2.5-3B-Instruct with Qwen/Qwen2.5-3B and retains their 32,768-token context length. The result is a merged base model intended to combine the strengths of both constituent Qwen2.5 models for general language tasks.
Model Overview
Alelcv27/Qwen2.5-3B-Arcee-Base-INST is a 3.1 billion parameter language model, a product of a merge operation using the Arcee Fusion method. This model is built upon the robust Qwen2.5-3B architecture, integrating components from both the base Qwen/Qwen2.5-3B and the instruction-tuned Qwen/Qwen2.5-3B-Instruct models.
Key Characteristics
- Architecture: Based on the Qwen2.5-3B family, known for its strong performance in its size class.
- Parameter Count: Features 3.1 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a substantial context window of 32,768 tokens, enabling processing of longer inputs and generation of more coherent, extended outputs.
- Merge Method: Utilizes the Arcee Fusion technique via mergekit, which selectively fuses the most significant parameter changes from the instruction-tuned model into the base model rather than averaging all weights.
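The exact merge configuration is not published on this card, but a mergekit recipe for an Arcee Fusion merge of these two models would look roughly like the following sketch (the `dtype` and field layout are assumptions based on typical mergekit configs, not the author's actual recipe):

```yaml
# Hypothetical mergekit config for an Arcee Fusion merge of
# Qwen2.5-3B (base) and Qwen2.5-3B-Instruct.
base_model: Qwen/Qwen2.5-3B
models:
  - model: Qwen/Qwen2.5-3B-Instruct
merge_method: arcee_fusion
dtype: bfloat16
```

With mergekit installed, a config like this is typically run with `mergekit-yaml config.yaml ./output-dir`.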
Intended Use Cases
This model is suitable for developers and researchers looking for a merged base model that combines the general language understanding of Qwen2.5-3B with the instruction-following capabilities derived from Qwen2.5-3B-Instruct. It can serve as a foundation for further fine-tuning or for applications requiring a capable, medium-sized language model with a large context window.
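As a starting point for such applications, the model can be loaded like any other Qwen2.5 checkpoint with Hugging Face transformers. The sketch below is illustrative: the generation settings are arbitrary, and the ChatML prompt helper assumes the Qwen2.5-Instruct chat format carried over through the merge.

```python
# Sketch: loading and prompting the merged model with Hugging Face
# transformers. Generation settings and the ChatML prompt helper are
# illustrative assumptions, not documented behavior of this checkpoint.

MODEL_ID = "Alelcv27/Qwen2.5-3B-Arcee-Base-INST"


def build_prompt(user_message: str) -> str:
    """Wrap a user message in the ChatML format used by Qwen2.5
    instruct models (assumed to carry over to this merge)."""
    return (
        "<|im_start|>user\n" + user_message + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate(user_message: str, max_new_tokens: int = 256) -> str:
    # transformers is imported lazily so the prompt helper above can be
    # used without the dependency installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    inputs = inputs.to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Because this is a base/instruct merge, plain completion-style prompts (no chat template) are also a reasonable starting point for fine-tuning experiments.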