Haicaochi/Qwen_05_txtt_V2_Stable

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Nov 2, 2025License:apache-2.0Architecture:Transformer Open Weights Warm

Haicaochi/Qwen_05_txtt_V2_Stable is a 0.5 billion parameter causal language model, fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. This model is optimized for text generation tasks, demonstrating a validation loss of 0.3150. It is suitable for applications requiring a compact yet capable instruction-following model.

Loading preview...

Model Overview

Haicaochi/Qwen_05_txtt_V2_Stable is a 0.5 billion parameter instruction-tuned language model, derived from the Qwen/Qwen2.5-0.5B-Instruct architecture. This version has undergone further fine-tuning, achieving a validation loss of 0.3150 during its training process.

Training Details

The model was trained with a learning rate of 5e-05, using an Adam optimizer with betas=(0.9, 0.999) and epsilon=1e-08. Training involved a batch size of 8, with a gradient accumulation of 16 steps, leading to an effective total batch size of 128. A cosine learning rate scheduler with a 0.1 warmup ratio was employed over 20 epochs. Key training results show a progressive decrease in validation loss across epochs.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-0.5B-Instruct.
  • Parameter Count: 0.5 billion parameters.
  • Context Length: Supports a context length of 32768 tokens.
  • Performance: Achieved a final validation loss of 0.3150.

Potential Use Cases

Given its compact size and instruction-tuned nature, this model is suitable for applications where resource efficiency is important, such as:

  • Lightweight text generation.
  • Instruction-following tasks on constrained environments.
  • Rapid prototyping for language-based applications.