Cooolder/SCOPE-CoT-sft
Cooolder/SCOPE-CoT-sft is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B for tasks drawn from the scope_kshot_format dataset. It retains the Qwen3 architecture and its 32K-token context length, and was adapted through supervised fine-tuning, reaching a final validation loss of 0.5126.
Model Overview
Cooolder/SCOPE-CoT-sft is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B. It was trained specifically on the scope_kshot_format dataset, indicating specialization for tasks related to that data, and it retains Qwen3-8B's 32,768-token context length.
Training Details
The model underwent supervised fine-tuning (SFT) for 2 epochs on a 2-GPU setup. Key hyperparameters included a learning rate of 2e-05, a total effective batch size of 16 (with 8 gradient accumulation steps), and the AdamW (adamw_torch) optimizer. Training reached a final validation loss of 0.5126, with the loss decreasing steadily across roughly 7,000 steps.
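The reported hyperparameters imply a per-device batch size, since the total effective batch size is the product of per-device batch size, device count, and gradient accumulation steps. A minimal sketch of that arithmetic, using illustrative variable names (only the numbers come from the training details above):

```python
# Hyperparameters reported in the training details above.
num_devices = 2        # multi-GPU setup (2 devices)
grad_accum_steps = 8   # gradient accumulation steps
total_batch_size = 16  # total effective training batch size

# effective batch = per_device_batch * num_devices * grad_accum_steps,
# so the implied per-device batch size is:
per_device_batch = total_batch_size // (num_devices * grad_accum_steps)
print(per_device_batch)  # → 1
```

With these settings, each GPU processes a single example per forward pass, and gradients are accumulated over 8 steps before each optimizer update.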
Potential Use Cases
Given its fine-tuning on the scope_kshot_format dataset, this model is likely best suited for applications that align closely with the characteristics and structure of that specific data. Developers should evaluate its performance on tasks requiring specialized understanding or generation capabilities derived from its training domain.
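The model card does not include usage code, but since the base model is Qwen/Qwen3-8B, inference presumably follows the standard Hugging Face transformers chat workflow. A hedged sketch (the function name and generation parameters are illustrative; loading the 8B checkpoint requires substantial disk space and GPU memory):

```python
def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion from Cooolder/SCOPE-CoT-sft.

    Imports are deferred so this sketch can be defined without
    transformers installed; calling it triggers the model download.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Cooolder/SCOPE-CoT-sft"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Qwen3 chat models expect the chat template applied to a message list.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

Because the fine-tuning data follows the scope_kshot_format structure, prompts formatted to match that dataset's conventions will likely yield the best results.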