Cooolder/SCOPE-CoT-sft-v2
Cooolder/SCOPE-CoT-sft-v2 is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. The model is optimized for Chain-of-Thought (CoT) reasoning through supervised fine-tuning on the scope_sft_cot dataset, with the goal of generating explicit step-by-step reasoning for complex problem-solving applications.
Overview
Cooolder/SCOPE-CoT-sft-v2 builds on the Qwen3-4B-Instruct-2507 architecture and underwent supervised fine-tuning (SFT) on the scope_sft_cot dataset to strengthen its Chain-of-Thought (CoT) reasoning. Training ran for 1.0 epoch at a learning rate of 1e-05, using the AdamW optimizer with cosine learning-rate scheduling.
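The snippet below is a minimal usage sketch. It assumes the standard transformers chat-template API inherited from the Qwen3 base model; the prompt and generation settings are illustrative and not taken from the model card.

```python
# Minimal usage sketch; assumes the standard transformers chat API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Cooolder/SCOPE-CoT-sft-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Why does ice float on water?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# max_new_tokens is an illustrative choice, not a documented default.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```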
Key Capabilities
- Enhanced Chain-of-Thought Reasoning: Fine-tuned to generate more structured and logical step-by-step reasoning processes.
- Foundation Model: Based on Qwen3-4B-Instruct-2507, inheriting the base model's general language understanding and generation abilities.
Training Details
The model was trained with a learning rate of 1e-05, a cosine learning-rate scheduler with a warmup ratio of 0.1, a batch size of 1, and gradient accumulation over 8 steps, for an effective batch size of 8. Training concluded with a validation loss of 0.3770.
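For reference, the reported hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a reconstruction from the numbers above, not the published training script; output_dir and the specific AdamW variant are assumptions.

```python
# Hypothetical reconstruction of the reported SFT hyperparameters;
# the actual training script has not been published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="scope-cot-sft-v2",    # placeholder path
    learning_rate=1e-5,               # reported learning rate
    num_train_epochs=1.0,             # reported epochs
    per_device_train_batch_size=1,    # reported batch size
    gradient_accumulation_steps=8,    # effective batch size of 8
    lr_scheduler_type="cosine",       # reported scheduler
    warmup_ratio=0.1,                 # reported warmup ratio
    optim="adamw_torch",              # AdamW (variant assumed)
)
```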
Intended Uses
This model is particularly suited to applications that require explicit reasoning steps (a prompting sketch follows the list), such as:
- Complex question answering
- Logical problem-solving
- Educational tools that explain solutions step-by-step
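As an illustration of the educational use case, the sketch below reuses the tokenizer and model loaded in the Overview snippet. The system message and the request for numbered steps are assumptions about prompting style, not a documented output format.

```python
# Illustrative tutor-style CoT prompt; reuses `tokenizer` and `model`
# from the Overview snippet. The numbered-steps format is an assumption.
messages = [
    {"role": "system", "content": "You are a math tutor. Explain each "
                                  "solution as numbered steps before "
                                  "stating the final answer."},
    {"role": "user", "content": "Solve for x: 3x + 7 = 22."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```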
Limitations
The model card currently provides little information on specific limitations or broader intended uses; further evaluation and documentation are needed.