ShenaoZhang/0.001_idpo_noreplacerej_iter_2
ShenaoZhang/0.001_idpo_noreplacerej_iter_2 is a 7-billion-parameter language model, fine-tuned from ShenaoZhang/0.001_idpo_noreplacerej_iter_1 on the ShenaoZhang/0.001_idpo_noreplacerej_dataset. The model was trained for one epoch with a learning rate of 5e-07 and a total effective batch size of 128 on a multi-GPU setup. Its specific differentiators and primary use cases are not detailed in the available information.
Model Overview
ShenaoZhang/0.001_idpo_noreplacerej_iter_2 is a 7-billion-parameter language model, a fine-tuned iteration of the previously released ShenaoZhang/0.001_idpo_noreplacerej_iter_1. Fine-tuning was performed on the ShenaoZhang/0.001_idpo_noreplacerej_dataset.
Training Details
The model was trained for a single epoch with the following hyperparameters:
- Learning Rate: 5e-07
- Batch Sizes: A per-device training batch size of 8 with 2 gradient accumulation steps across 8 GPUs, for an effective training batch size of 128; a per-device evaluation batch size of 8, for an effective evaluation batch size of 64.
- Optimizer: Adam with default betas and epsilon.
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Hardware: Training was distributed across 8 GPUs.
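As a sanity check, the hyperparameters above can be sketched in plain Python: the effective batch sizes follow from per-device size, GPU count, and gradient accumulation, and the cosine schedule with a 0.1 warmup ratio can be written as a simple function. The function names here are illustrative, not part of any training library.

```python
import math

def effective_batch_size(per_device: int, num_gpus: int, grad_accum: int) -> int:
    """Total examples contributing to one optimizer step."""
    return per_device * num_gpus * grad_accum

def cosine_lr(step: int, total_steps: int, peak_lr: float, warmup_ratio: float) -> float:
    """Cosine decay with linear warmup over the first warmup_ratio of steps."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Values reported in this card: 8 per device, 8 GPUs, 2 accumulation steps.
train_bs = effective_batch_size(per_device=8, num_gpus=8, grad_accum=2)  # 128
eval_bs = effective_batch_size(per_device=8, num_gpus=8, grad_accum=1)   # 64 (no accumulation at eval)
print(train_bs, eval_bs)

# Learning rate peaks at 5e-07 after warmup, then decays to 0.
print(cosine_lr(step=100, total_steps=1000, peak_lr=5e-07, warmup_ratio=0.1))
```

This only reproduces the arithmetic behind the reported settings; the actual run used an Adam optimizer and a framework-provided cosine scheduler, which may differ in minor details such as a non-zero final learning rate.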
Current Status
Detailed information about the model's capabilities, intended uses, limitations, and evaluation results is not yet available in the provided documentation. Consult future updates to the model card for a fuller picture of its applications and performance characteristics.