TheTsar1209/qwen-carpmuscle-r-v0.3

Text generation · Concurrency cost: 1 · Model size: 14.8B · Quant: FP8 · Ctx length: 32k · Published: Oct 23, 2024 · Architecture: Transformer

TheTsar1209/qwen-carpmuscle-r-v0.3 is a 14.8 billion parameter language model based on the Qwen2.5 architecture, developed by TheTsar1209. This model was created using Rombodawg's Shared Continuous Finetuning method, merging a continuously pretrained Qwen2.5-14B model with Qwen2.5-14B-Instruct and Qwen2.5-14B using the TIES merging technique. It supports a context length of 131072 tokens and is designed for general text generation tasks across multiple languages including Chinese, English, French, Spanish, and more.


Model Overview

TheTsar1209/qwen-carpmuscle-r-v0.3 is a 14.8 billion parameter language model developed by TheTsar1209, built from the Qwen2.5-14B and Qwen2.5-14B-Instruct models via Rombodawg's Shared Continuous Finetuning method. The recipe has two stages: continuous pretraining on ChatML-formatted data at a 24k context length, starting from Unsloth's optimized Qwen2.5-14B-bnb-4bit checkpoint, followed by a TIES merge of the resulting model with Qwen2.5-14B-Instruct and the Qwen2.5-14B base.
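Because training used the ChatML format, inference prompts should follow the same layout. A minimal sketch of that format (the helper name and the choice to leave the assistant turn open for completion are illustrative, not from the model card):

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt.

    ChatML wraps each turn in <|im_start|>{role}\n ... <|im_end|> markers.
    """
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model generates the reply.
    prompt += "<|im_start|>assistant\n"
    return prompt
```

In practice, the tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template` in transformers) produces this layout for Qwen2.5-family models, so manual formatting is only needed when bypassing that machinery.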

Key Characteristics

  • Architecture: Based on the Qwen2.5 family, leveraging both base and instruct variants.
  • Merging Technique: Employs the TIES method (TrIm, Elect Sign & Merge) via mergekit to combine different model checkpoints.
  • Multilingual Support: Capable of handling text generation in numerous languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
  • Training Optimization: The underlying qwen-carpmuscle-v0.3 component was trained 2x faster using Unsloth together with Hugging Face's TRL library.
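The TIES merge named above proceeds per tensor in three steps: trim each fine-tuned model's delta from the base, elect a sign per parameter, then average only the agreeing deltas. A minimal NumPy illustration (function name, density value, and shapes are assumptions for the sketch; mergekit's actual implementation differs in details):

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5):
    """Sketch of TIES merging (Trim, Elect Sign, Merge) for one tensor."""
    # 1. Trim: keep only the top-`density` fraction of each task vector
    #    (delta from the base) by magnitude; zero out the rest.
    deltas = []
    for w in finetuned:
        d = w - base
        k = max(1, int(density * d.size))
        thresh = np.sort(np.abs(d).ravel())[-k]
        deltas.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # 2. Elect sign: per parameter, take the sign of the summed deltas
    #    (the direction with the larger total magnitude wins).
    stacked = np.stack(deltas)
    sign = np.sign(np.sum(stacked, axis=0))
    sign[sign == 0] = 1.0

    # 3. Merge: average only the nonzero deltas that agree with the
    #    elected sign, then add the result back onto the base weights.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    summed = np.where(agree, stacked, 0.0).sum(axis=0)
    counts = np.maximum(agree.sum(axis=0), 1)
    return base + summed / counts
```

Trimming reduces interference between checkpoints, and sign election prevents conflicting updates from cancelling each other out, which is why TIES tends to preserve more of each parent model's behavior than plain averaging.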

Performance Highlights

Evaluated on the Open LLM Leaderboard, the model shows:

  • IFEval (0-Shot): 44.55 strict accuracy
  • BBH (3-Shot): 46.38 normalized accuracy
  • MMLU-PRO (5-shot): 45.59 accuracy

Use Cases

This model is suitable for general text generation tasks where a blend of capabilities from the base and instruction-tuned Qwen2.5 models is desired, particularly in multilingual contexts. The merge is intended to combine the instruct model's instruction following with the base model's raw generation strengths.