GreatGoose/Qwen2.5-3B-Instruct-full-loglm

Text generation · 3.1B parameters · BF16 · 32k context · Published: Jan 14, 2026 · License: other · Architecture: Transformer

GreatGoose/Qwen2.5-3B-Instruct-full-loglm is a 3.1 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-3B-Instruct. It is suitable for general instruction-following tasks where a compact yet capable model is required.


Model Overview

This model is a fine-tuned iteration of the Qwen/Qwen2.5-3B-Instruct base model. It uses the 3.1 billion parameter Qwen2.5 architecture, which is known for its efficiency and performance in its size class. Fine-tuning ran for 3 epochs with a learning rate of 1e-05 and a total training batch size of 16 across two GPUs.

Training Details

  • Base Model: Qwen/Qwen2.5-3B-Instruct
  • Learning Rate: 1e-05
  • Total Training Batch Size: 16 (across 2 GPUs)
  • Optimizer: AdamW_Torch_Fused with default betas and epsilon
  • Epochs: 3.0
  • Frameworks: Transformers 4.57.1, PyTorch 2.9.1+cu128, Datasets 4.0.0, Tokenizers 0.22.2
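The card reports a total training batch size of 16 across two GPUs but does not state the per-device batch size or gradient-accumulation setting. A minimal sketch of the usual effective-batch-size arithmetic, assuming a per-device batch of 8 with no gradient accumulation (both assumptions, not from the card):

```python
# Effective batch size = per-device batch * number of GPUs * gradient accumulation steps.
per_device_batch_size = 8   # assumption: not stated in the model card
num_gpus = 2                # from the model card
grad_accum_steps = 1        # assumption: not stated in the model card

effective_batch_size = per_device_batch_size * num_gpus * grad_accum_steps
print(effective_batch_size)  # 16, matching the reported total training batch size
```

Other splits (e.g. per-device batch 4 with 2 accumulation steps) produce the same effective batch size of 16.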

Intended Use Cases

This model is designed for general instruction-following applications, benefiting from its compact size and fine-tuned nature. It is suitable for scenarios requiring a capable language model with moderate computational resources.
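Since this is a standard Qwen2.5 fine-tune, it can be loaded with the Hugging Face transformers API like any other causal language model. A minimal sketch, assuming a CUDA-capable machine with enough memory for the BF16 weights (the generation settings are illustrative, not taken from the card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "GreatGoose/Qwen2.5-3B-Instruct-full-loglm"


def load_model():
    # Downloads roughly 6 GB of BF16 weights on first use.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    return tokenizer, model


def generate(tokenizer, model, user_message, max_new_tokens=256):
    # Format the request with the model's built-in chat template,
    # then decode only the newly generated tokens.
    messages = [{"role": "user", "content": user_message}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    tokenizer, model = load_model()
    print(generate(tokenizer, model, "Summarize what a causal language model is."))
```

For lighter-weight deployment, the same code works with a quantized checkpoint or a smaller `max_new_tokens` budget.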