Haicaochi/Qwen_05_txtt_V2_Stable
Haicaochi/Qwen_05_txtt_V2_Stable is a 0.5 billion parameter causal language model, fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. This model is optimized for text generation tasks, demonstrating a validation loss of 0.3150. It is suitable for applications requiring a compact yet capable instruction-following model.
Loading preview...
Model Overview
Haicaochi/Qwen_05_txtt_V2_Stable is a 0.5 billion parameter instruction-tuned language model, derived from the Qwen/Qwen2.5-0.5B-Instruct architecture. This version has undergone further fine-tuning, achieving a validation loss of 0.3150 during its training process.
Training Details
The model was trained with a learning rate of 5e-05, using an Adam optimizer with betas=(0.9, 0.999) and epsilon=1e-08. Training involved a batch size of 8, with a gradient accumulation of 16 steps, leading to an effective total batch size of 128. A cosine learning rate scheduler with a 0.1 warmup ratio was employed over 20 epochs. Key training results show a progressive decrease in validation loss across epochs.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-0.5B-Instruct.
- Parameter Count: 0.5 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Performance: Achieved a final validation loss of 0.3150.
Potential Use Cases
Given its compact size and instruction-tuned nature, this model is suitable for applications where resource efficiency is important, such as:
- Lightweight text generation.
- Instruction-following tasks on constrained environments.
- Rapid prototyping for language-based applications.