varshak1/reproducing-openrubric-rubric-sft
varshak1/reproducing-openrubric-rubric-sft is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the reproducing-openrubric-rubric-sft dataset, with a context length of 32,768 tokens. The fine-tuning targets rubric-based instruction following, building on the base architecture for specialized performance in this domain.
Model Overview
This model, reproducing-openrubric-rubric-sft, is a fine-tuned version of the Qwen/Qwen3-8B base model, inheriting its 8 billion parameters and 32,768-token context window. The fine-tuning dataset, reproducing-openrubric-rubric-sft, indicates a specialization in rubric generation and rubric-based instruction following.
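The checkpoint should be loadable with the standard transformers API. A minimal sketch, assuming the usual AutoModel pattern (only the repo id comes from this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "varshak1/reproducing-openrubric-rubric-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # requires accelerate; places layers on available GPUs
)
```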
Training Details
The model was trained with the following key hyperparameters; a sketch of equivalent TrainingArguments follows the list.
- Learning Rate: 8e-06
- Batch Size: 4 (train), 8 (eval)
- Gradient Accumulation: 4 steps; combined with the per-device train batch size of 4, the stated effective batch size of 128 implies training across 8 devices (4 × 4 × 8 = 128).
- Optimizer: AdamW (adamw_torch) with default betas and epsilon.
- Scheduler: Cosine learning-rate schedule with a warmup ratio of 0.05.
- Epochs: 1.0
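These settings map onto transformers.TrainingArguments roughly as sketched below. This is a reconstruction from the listed values, not the original training script; the output path and the 8-device count are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="reproducing-openrubric-rubric-sft",  # placeholder path
    learning_rate=8e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    optim="adamw_torch",        # AdamW with default betas/epsilon
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=1.0,
)
# 4 (per-device batch) x 4 (accumulation) x 8 (assumed devices) = 128 effective
```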
Intended Use Cases
While the card does not detail specific use cases, fine-tuning on a "rubric-sft" dataset suggests potential applications in the areas below (a generation sketch follows the list):
- Generating rubrics for various tasks.
- Assisting with grading or evaluation based on provided rubrics.
- Understanding and responding to instructions framed within a rubric structure.
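As an illustration of the first use case, one might prompt the model through its chat template to draft a rubric. The prompt wording and generation settings here are purely illustrative, not taken from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "varshak1/reproducing-openrubric-rubric-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Ask the fine-tuned model to draft a grading rubric (illustrative prompt).
messages = [
    {"role": "user",
     "content": "Write a four-criterion grading rubric for a short persuasive essay."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```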
Limitations
The README notes that more information is needed about the model's specific limitations and intended uses; users should run their own evaluations before relying on it for critical applications.