Overview
moogician/DSR1-Qwen-32B-131fad2c is a 32-billion-parameter language model fine-tuned from the deepseek-ai/DeepSeek-R1-Distill-Qwen-32B base model. It was trained on the fc_rlm dataset, suggesting a specialized focus that likely improves performance on tasks matching that data distribution. The model supports a context length of 32768 tokens, allowing it to process long, complex sequences.
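As a minimal sketch, the model can be loaded with the Hugging Face transformers library. The repo id and context length are taken from this card; the `device_map` and dtype choices are illustrative assumptions, not documented settings:

```python
MODEL_ID = "moogician/DSR1-Qwen-32B-131fad2c"
MAX_CONTEXT = 32768  # context length stated on this card


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     max_ctx: int = MAX_CONTEXT) -> int:
    """Keep prompt + generation within the 32K context window."""
    return max(0, min(requested, max_ctx - prompt_tokens))


def load_and_generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Hedged loading sketch; downloads ~65 GB of weights when called.

    device_map="auto" and torch_dtype="auto" are assumptions for
    illustration, not settings documented on the card.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    budget = clamp_new_tokens(inputs["input_ids"].shape[1], max_new_tokens)
    out = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Clamping the generation budget against the 32K window avoids requesting more tokens than the context can hold when prompts are long.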
Key Training Details
- Base Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- Fine-tuning Dataset: fc_rlm
- Learning Rate: 1e-05
- Batch Size: 2 per device (train), 8 per device (eval), with 4 gradient accumulation steps; the reported effective batch size of 64 implies training across 8 devices (2 × 4 × 8).
- Optimizer: AdamW with standard betas and epsilon.
- Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio.
- Epochs: 4.0
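The hyperparameters above can be collected into a minimal config sketch. The `world_size` of 8 is an inference from the batch-size arithmetic (2 × 4 × 8 = 64), not a value stated on this card:

```python
# Training hyperparameters as listed above; world_size is an assumption
# inferred from the reported effective batch size of 64.
config = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 4.0,
    "world_size": 8,  # assumed device count, not documented
}


def effective_batch_size(cfg: dict) -> int:
    """Samples contributing to each optimizer step across all devices."""
    return (cfg["per_device_train_batch_size"]
            * cfg["gradient_accumulation_steps"]
            * cfg["world_size"])


print(effective_batch_size(config))  # -> 64
```

This is the standard relationship between per-device batch size, gradient accumulation, and device count; only the first two of those three factors are explicit in the card.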
Potential Use Cases
Given its fine-tuning on the fc_rlm dataset, this model is likely optimized for:
- Tasks requiring nuanced understanding or generation based on reinforcement learning from human feedback (RLHF) data.
- Applications where the specific characteristics of the fc_rlm dataset are relevant for improved performance.
- Scenarios benefiting from a 32B-parameter model with a large 32K context window for detailed analysis or generation.