axel-datos/qwen2.5-0.5b-instruct_gsm8k_full-finetuningV2
axel-datos/qwen2.5-0.5b-instruct_gsm8k_full-finetuningV2 is a fine-tuned version of the Qwen/qwen2.5-0.5b-instruct model. This model has been adapted using a customized dataset, suggesting a specialization beyond its base instruction-following capabilities. It is intended for tasks aligned with its specific fine-tuning, though further details on its exact capabilities and limitations are not provided.
Loading preview...
Model Overview
This model, axel-datos/qwen2.5-0.5b-instruct_gsm8k_full-finetuningV2, is a fine-tuned variant of the Qwen/qwen2.5-0.5b-instruct base model. It has undergone further training on a customized dataset, indicating an optimization for specific tasks or domains not covered by the original instruction-tuned model.
Training Details
The fine-tuning process utilized the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Sizes:
train_batch_sizeof 1,eval_batch_sizeof 8 - Optimizer: AdamW with
betas=(0.9, 0.999)andepsilon=1e-08 - Scheduler: Linear learning rate scheduler
- Epochs: Trained for 0.01 epochs
- Precision: Native AMP for mixed-precision training
Framework Versions
The training environment included:
- Transformers 4.46.3
- PyTorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3
Intended Use
While specific intended uses and limitations are not detailed, the model's fine-tuning on a custom dataset suggests it is tailored for particular applications. Users should evaluate its performance on their specific tasks, especially those related to the undisclosed custom dataset.