ShenaoZhang/0.001_idpo_iter_2
The ShenaoZhang/0.001_idpo_iter_2 model is a fine-tuned iteration building upon ShenaoZhang/0.001_idpo_iter_1, developed by ShenaoZhang. It was trained with a learning rate of 5e-07 and a total batch size of 128 over 1 epoch. The model is part of an iterative development process, with its primary differentiation stemming from fine-tuning on the ShenaoZhang/0.001_idpo_dataset. Detailed information about its capabilities and intended uses has not yet been published.
Model Overview
ShenaoZhang/0.001_idpo_iter_2 is an iteratively fine-tuned language model developed by ShenaoZhang. It is a direct successor to ShenaoZhang/0.001_idpo_iter_1, having been fine-tuned on the ShenaoZhang/0.001_idpo_dataset.
Training Details
The model underwent a single training epoch with a learning rate of 5e-07. Key training hyperparameters included:
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Batch Sizes: `train_batch_size` of 8 and `eval_batch_size` of 8 per device; with 8 devices and 2 gradient accumulation steps, this yields a `total_train_batch_size` of 128 and a `total_eval_batch_size` of 64.
- Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
- Seed: 42 for reproducibility.
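The batch-size arithmetic and the cosine-with-warmup schedule above can be sketched in plain Python. This is illustrative only: the total step count below is a hypothetical value, since the card does not state the dataset size.

```python
import math

# Effective train batch size = per-device batch * devices * gradient accumulation steps.
per_device_train_batch = 8
num_devices = 8
grad_accum_steps = 2
total_train_batch = per_device_train_batch * num_devices * grad_accum_steps  # 128

# Effective eval batch size: gradient accumulation does not apply during evaluation.
per_device_eval_batch = 8
total_eval_batch = per_device_eval_batch * num_devices  # 64

def cosine_lr_with_warmup(step, total_steps, base_lr=5e-07, warmup_ratio=0.1):
    """Linear warmup over the first warmup_ratio of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Hypothetical single-epoch run of 1000 optimizer steps.
total_steps = 1000
peak = max(cosine_lr_with_warmup(s, total_steps) for s in range(total_steps + 1))
```

The learning rate ramps linearly to its peak of 5e-07 at 10% of training, then follows a half-cosine down to zero by the final step.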
Framework Versions
The training environment utilized:
- Transformers 4.36.2
- PyTorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
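To reproduce this environment, the pinned versions above can be installed directly. This is a sketch: the `+cu121` PyTorch build shown in the card is distributed from the PyTorch CUDA 12.1 wheel index rather than PyPI.

```shell
pip install "transformers==4.36.2" "datasets==2.14.6" "tokenizers==0.15.2"
# The CUDA 12.1 build of PyTorch comes from the PyTorch wheel index:
pip install "torch==2.1.2" --index-url https://download.pytorch.org/whl/cu121
```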
Current Status
Further information regarding the model's specific capabilities, intended uses, limitations, and evaluation results is currently pending. Users are advised that detailed performance metrics and specific application guidance are not yet available.