Model Overview
UWNSL/Qwen2.5-3B-Instruct_Short_CoT is a 3.1-billion-parameter instruction-tuned model built on Qwen/Qwen2.5-3B-Instruct. It was further fine-tuned on the MATH_training_Qwen2.5-32B-Instruct dataset, indicating a specialization in mathematical problem solving and reasoning.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-3B-Instruct.
- Parameter Count: 3.1 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Training Focus: Optimized for mathematical tasks, as evidenced by its training dataset.
- Performance: Reached a final validation loss of 0.1360 during training, suggesting the model fit its specialized domain well.
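Like other Qwen2.5 instruction-tuned models, this model consumes conversations in the ChatML format. In practice the prompt is produced by the tokenizer's `apply_chat_template`; the sketch below shows the underlying structure by hand, with the system and user strings as illustrative placeholders:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    # ChatML layout used by Qwen2.5-family tokenizers; normally generated
    # via tokenizer.apply_chat_template rather than assembled manually.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Solve: what is 12 * 7?",
)
print(prompt)
```

The trailing `<|im_start|>assistant` turn is left open so the model generates the reply; generation is typically stopped at the `<|im_end|>` token.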
Intended Use Cases
This model is particularly suitable for applications requiring:
- Mathematical Reasoning: Solving complex math problems or generating mathematical explanations.
- Instruction Following: Executing instructions related to numerical or logical tasks.
- Specialized NLP: Tasks where a strong understanding of mathematical concepts is beneficial.
Training Details
The model was trained for 2 epochs with a learning rate of 1e-05, the AdamW optimizer, and a cosine learning-rate scheduler, using a total batch size of 16 distributed across 4 GPUs.
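The stated totals are consistent with, for example, a per-device batch of 4 and no gradient accumulation; only the total of 16 across 4 GPUs is given, so the exact split below is an assumption:

```python
# Effective (total) batch size = per-device batch * number of GPUs * gradient
# accumulation steps. The per-device batch and accumulation steps here are
# assumptions; only the total of 16 across 4 GPUs is stated in the card.
per_device_batch = 4
num_gpus = 4
grad_accum_steps = 1

total_batch = per_device_batch * num_gpus * grad_accum_steps
print(total_batch)  # → 16
```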