moogician/DSR1-Qwen-32B-131fad2c

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · License: other · Architecture: Transformer · Cold

moogician/DSR1-Qwen-32B-131fad2c is a 32 billion parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. It was fine-tuned specifically on the fc_rlm dataset, which suggests an optimization for reinforcement learning from human feedback (RLHF)-style data or tasks with a similar distribution. The model retains a substantial context length of 32,768 tokens, making it suitable for processing extensive inputs.


Overview

moogician/DSR1-Qwen-32B-131fad2c is a 32 billion parameter language model derived from the deepseek-ai/DeepSeek-R1-Distill-Qwen-32B base model. It has been fine-tuned on the fc_rlm dataset, indicating a specialized training focus that likely enhances its performance on tasks related to that specific data distribution. The model supports a context length of 32768 tokens, allowing for the processing of long and complex sequences.
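The snippet below is a minimal inference sketch using Hugging Face transformers. It assumes the checkpoint inherits the chat template of its DeepSeek-R1-Distill-Qwen-32B base and that bf16 weights fit on your hardware; the generation settings are illustrative, not official recommendations.

```python
# Minimal inference sketch (assumption: the checkpoint ships the standard
# DeepSeek-R1-Distill-Qwen chat template; adjust dtype/device to your hardware).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moogician/DSR1-Qwen-32B-131fad2c"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits your accelerator
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize the key ideas behind reinforcement learning from human feedback."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```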

Key Training Details

  • Base Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
  • Fine-tuning Dataset: fc_rlm
  • Learning Rate: 1e-05
  • Batch Size: 2 per device (train), 8 (eval), with 4 gradient accumulation steps; the reported total effective train batch size of 64 is consistent with training across 8 devices.
  • Optimizer: AdamW with standard betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio.
  • Epochs: 4.0 (see the configuration sketch after this list)
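For readability, the values above map onto a Hugging Face TrainingArguments configuration roughly as sketched below. This is a reconstruction from the listed hyperparameters, not the author's actual training script; the fc_rlm dataset and data-loading pipeline are not published, so they are omitted.

```python
# Hypothetical reconstruction of the listed hyperparameters as Hugging Face
# TrainingArguments; treat this purely as a readable summary of the values above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dsr1-qwen-32b-fc_rlm",   # placeholder output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,        # effective batch of 64 implies ~8 devices
    num_train_epochs=4.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",                  # AdamW with default betas and epsilon
    bf16=True,                            # assumption; training precision is not stated
)
```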

Potential Use Cases

Given its fine-tuning on the fc_rlm dataset, this model is likely optimized for:

  • Tasks requiring nuanced understanding or generation based on reinforcement learning from human feedback (RLHF) data.
  • Applications where the specific characteristics of the fc_rlm dataset are relevant for improved performance.
  • Scenarios benefiting from a 32B parameter model with a large 32K context window for detailed analysis or generation (see the serving sketch below).
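As a sketch of how the 32K context window and FP8 quantization could be exercised in practice, the example below serves the model with vLLM. The quantization flag, tensor-parallel degree, and context-length setting mirror the metadata above but are deployment assumptions, not documented settings for this checkpoint.

```python
# Hypothetical long-context serving sketch with vLLM; max_model_len and the fp8
# flag mirror the card's metadata (32k context, FP8 quant) and may need adjusting
# to the actual checkpoint format and available GPU memory.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moogician/DSR1-Qwen-32B-131fad2c",
    max_model_len=32768,
    quantization="fp8",       # assumption: runtime FP8; drop if the weights are pre-quantized
    tensor_parallel_size=2,   # assumption: split a 32B model across two GPUs
)

long_document = "..."  # up to ~32k tokens of input
params = SamplingParams(temperature=0.6, max_tokens=1024)
outputs = llm.generate([f"Summarize the following document:\n\n{long_document}"], params)
print(outputs[0].outputs[0].text)
```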