GreatGoose/Qwen2.5-3B-Instruct-full-loglm
GreatGoose/Qwen2.5-3B-Instruct-full-loglm is a 3.1-billion-parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-3B-Instruct. The additional training builds on the base model's instruction-following capabilities, making it suitable for general instruction-following tasks where a compact yet capable model is required.
Model Overview
This model is a fine-tuned iteration of the Qwen/Qwen2.5-3B-Instruct base model. It uses the 3.1-billion-parameter Qwen2.5 architecture, known for its efficiency and performance in its size class. Fine-tuning ran for 3 epochs with a learning rate of 1e-05 and a total training batch size of 16 across two GPUs.
Training Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Learning Rate: 1e-05
- Total Train Batch Size: 16 (across 2 GPUs)
- Optimizer: AdamW (fused PyTorch implementation, `adamw_torch_fused`) with default betas and epsilon
- Epochs: 3.0
- Frameworks: Transformers 4.57.1, PyTorch 2.9.1+cu128, Datasets 4.0.0, Tokenizers 0.22.2
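The total train batch size of 16 can be understood as the product of the per-device batch size, the GPU count, and any gradient accumulation steps, as computed by the Hugging Face `Trainer`. The per-device batch size and accumulation steps below are illustrative assumptions; the card only states the total (16) and the GPU count (2):

```python
def effective_batch_size(per_device: int, num_gpus: int, grad_accum: int = 1) -> int:
    """Effective (total) training batch size, as reported by the HF Trainer.

    per_device and grad_accum are assumed values for illustration; the model
    card only reports the total (16) and the number of GPUs (2).
    """
    return per_device * num_gpus * grad_accum


# One plausible configuration reproducing the reported total of 16:
assert effective_batch_size(per_device=8, num_gpus=2) == 16
# Another: smaller per-device batches with gradient accumulation.
assert effective_batch_size(per_device=4, num_gpus=2, grad_accum=2) == 16
```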
Intended Use Cases
This model is designed for general instruction-following applications. Its compact size makes it suitable for deployments with moderate computational resources, such as single-GPU inference.
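A minimal usage sketch with the Transformers chat API is shown below. The generation settings (e.g. `max_new_tokens`) are illustrative assumptions, not values from this card; imports are deferred into the function so the snippet can be defined without Transformers installed:

```python
# Hedged usage sketch: load the model and generate a chat reply.
# MODEL_ID comes from this card; generation parameters are assumptions.
MODEL_ID = "GreatGoose/Qwen2.5-3B-Instruct-full-loglm"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    # Deferred imports: defining this function does not require
    # transformers/torch; calling it does (and downloads the weights).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # Qwen2.5-Instruct models use a chat template; apply it to the prompt.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the generated continuation.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

Calling `generate_reply("Give me a one-line summary of gradient descent.")` would download the checkpoint on first use and run generation on the available device.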