lihaoxin2020/qwen3-4b-refiner-gpt54-ep3

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 18, 2026 · License: other · Architecture: Transformer

The lihaoxin2020/qwen3-4b-refiner-gpt54-ep3 model is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It was specialized on the refiner_gpt54_sft dataset, indicating an optimization for refinement tasks, and is intended for applications that require enhanced instruction following or the specific text generation capabilities derived from its fine-tuning data.


Model Overview

lihaoxin2020/qwen3-4b-refiner-gpt54-ep3 is a 4-billion-parameter language model built on Qwen3-4B-Instruct-2507. It was fine-tuned on the refiner_gpt54_sft dataset, suggesting it was developed to improve performance on refinement-oriented tasks and specific instruction-following scenarios.

Key Training Details

Training used a learning rate of 5e-06 with a per-device batch size of 2 and gradient accumulation over 8 steps; across the 2-GPU setup this gives an effective batch size of 2 × 8 × 2 = 32. The model was trained for 3 epochs with a cosine learning rate scheduler, a 0.05 warmup ratio, and the adamw_torch optimizer.
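
As a minimal sketch, these reported hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows; the output directory, BF16 flag, and any settings not listed above are assumptions, not documented configuration.

```python
# Sketch: the reported hyperparameters expressed as Hugging Face TrainingArguments.
# Output directory and bf16 flag are assumptions inferred from the model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-4b-refiner-gpt54-ep3",  # assumed name
    learning_rate=5e-6,
    per_device_train_batch_size=2,            # per device, 2 GPUs total
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch",
    bf16=True,                                # matches the BF16 precision listed above
)
# Effective batch size: 2 (per device) x 8 (accumulation) x 2 (GPUs) = 32
```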

Potential Use Cases

Given its fine-tuning on a 'refiner' dataset, this model is likely suitable for the following tasks (a usage sketch follows the list):

  • Text Refinement: Tasks involving improving the quality, coherence, or style of existing text.
  • Instruction Following: Applications where precise adherence to given instructions is critical.
  • Specialized Generation: Generating text that aligns with patterns or styles present in the refiner_gpt54_sft dataset.
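
The sketch below shows one way to load the model with the `transformers` library and prompt it for a refinement task. The prompt wording, generation settings, and chat-template usage are assumptions for illustration, not documented behavior of this model.

```python
# Sketch: loading the model and issuing a text-refinement prompt.
# Prompt phrasing and decoding settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lihaoxin2020/qwen3-4b-refiner-gpt54-ep3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

draft = "the results was good but the experment need more data to be sure"
messages = [
    {"role": "user",
     "content": f"Refine the following text for grammar and clarity:\n\n{draft}"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```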

Further details on specific intended uses and limitations would require more information about the refiner_gpt54_sft dataset and its characteristics.