micaebe/Qwen2.5-1.5B-Instruct-QwQ
micaebe/Qwen2.5-1.5B-Instruct-QwQ is a 1.54-billion-parameter instruction-tuned causal language model based on the Qwen2.5 architecture, developed by micaebe. The model is fine-tuned on QwQ reasoning chains, improving its performance on mathematical and general reasoning tasks, and it shows some self-correction ability, making it suitable for applications that need stronger reasoning from a compact model with a 32,768-token context length.
Overview
micaebe/Qwen2.5-1.5B-Instruct-QwQ is a 1.54-billion-parameter instruction-tuned causal language model, fine-tuned from the Qwen2.5-1.5B-Instruct base model. It uses the Qwen2.5 architecture: a transformer with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings. The model supports a 32,768-token input context length and up to 8,192 generated tokens.
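As a fine-tune of Qwen2.5-1.5B-Instruct, the checkpoint can be used with the standard Hugging Face transformers chat workflow. The following is a minimal usage sketch (not taken from the model card), assuming the repository ships the usual Qwen2.5 tokenizer and chat template; the system prompt and generation settings are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "micaebe/Qwen2.5-1.5B-Instruct-QwQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place the model on GPU if one is available
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]

# apply_chat_template formats the conversation with Qwen2.5's chat markup.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The card states up to 8,192 generated tokens; a smaller cap is used here for brevity.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```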
Key Capabilities
- Enhanced Mathematical Reasoning: Fine-tuned on approximately 20,000 samples from QwQ-32B-Preview, including math problems from the GSM8k and MATH datasets, leading to improved performance in mathematical contexts (see the prompting sketch after this list).
- General Reasoning: Shows better general reasoning capabilities compared to its base model.
- Self-Correction: Exhibits some self-correction abilities, though these are noted to be more limited than in larger Qwen2.5 models (e.g., 3B and 7B versions).
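Because the fine-tuning data consists of QwQ-style reasoning chains for GSM8k/MATH-type problems, the model is most naturally prompted with a plain math word problem and a generous generation budget, so the chain (including any backtracking or self-correction) can complete before the final answer. Below is a minimal sketch reusing the `model` and `tokenizer` objects loaded above; the prompt wording and generation settings are assumptions, not taken from the model card.

```python
def solve(question: str, max_new_tokens: int = 2048) -> str:
    """Prompt the model with a math word problem and return the decoded reasoning chain."""
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Reasoning chains can be long; leave room for the model to backtrack
    # and self-correct before it commits to a final answer.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(solve(
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
))
```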
Performance
- Achieves 73.2% accuracy on the GSM8k test set, evaluated on the first 27% of the test split (a scoring sketch follows below).
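The reported figure is not accompanied by an evaluation script, so the exact harness is unknown. The sketch below shows one common way such a number is scored, reusing the `solve` helper from the previous snippet and treating the last number in the model's output as its final answer; both the answer-extraction heuristic and the dataset handling are assumptions.

```python
import re
from datasets import load_dataset

def extract_gold(answer: str) -> str:
    # GSM8k gold answers end with "#### <number>".
    return answer.split("####")[-1].strip().replace(",", "")

def extract_prediction(text: str):
    # Heuristic: treat the last number the model writes as its final answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

test = load_dataset("openai/gsm8k", "main", split="test")
subset = test.select(range(int(0.27 * len(test))))  # first 27% of the test split, as in the card

correct = sum(
    extract_prediction(solve(ex["question"])) == extract_gold(ex["answer"])
    for ex in subset
)
print(f"accuracy: {correct / len(subset):.1%}")
```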
Good For
- Applications requiring a compact model with improved mathematical and general reasoning.
- Use cases where some degree of self-correction is beneficial, particularly in a 1.5B parameter size class.