UWNSL/Qwen2.5-1.5B-Instruct_Long_CoT
UWNSL/Qwen2.5-1.5B-Instruct_Long_CoT is a 1.5 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. This model is specifically optimized for mathematical reasoning tasks, leveraging a 32768 token context length. It is designed for applications requiring robust performance in quantitative problem-solving.
Loading preview...
Overview
UWNSL/Qwen2.5-1.5B-Instruct_Long_CoT is a specialized instruction-tuned language model, built upon the Qwen/Qwen2.5-1.5B-Instruct architecture. With 1.5 billion parameters and a substantial 32768 token context window, this model has undergone further fine-tuning on a mathematical dataset, indicated by its origin from "MATH_training_Qwen_QwQ_32B_Preview".
Key Characteristics
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Parameter Count: 1.5 billion
- Context Length: 32768 tokens
- Fine-tuning Focus: Mathematical reasoning tasks, as suggested by the training dataset name.
- Training Performance: Achieved a validation loss of 0.3632 over 2 epochs, demonstrating effective learning on the target dataset.
Intended Use Cases
This model is particularly suited for scenarios demanding strong performance in mathematical problem-solving and reasoning. Its extended context length could be beneficial for complex, multi-step mathematical problems or tasks requiring detailed contextual understanding.