taronklm/trained_model
taronklm/trained_model is a 0.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. Developed by taronklm and optimized for text generation, it achieves a BERTScore F1 of 0.9321 on its evaluation set. Its 32,768-token context length makes it suitable for applications that process moderately long inputs.
Model Overview
taronklm/trained_model is a 0.5-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-0.5B-Instruct base model. It was trained on a 'generator' dataset using the PEFT library, with a focus on text generation. On its evaluation set the model reaches a BERTScore F1 of 0.9321, with a precision of 0.9305 and a recall of 0.9338.
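Since the card names PEFT as the training framework, loading likely means attaching an adapter to the base model. The following is a minimal sketch, assuming the repository hosts a PEFT (e.g., LoRA) adapter and reuses the base tokenizer; if the weights were merged before upload, loading taronklm/trained_model directly with AutoModelForCausalLM suffices.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the card names, then attach the fine-tuned adapter.
# Assumption: taronklm/trained_model stores a PEFT adapter, not merged weights.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = PeftModel.from_pretrained(base, "taronklm/trained_model")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Qwen2.5-Instruct models expect the chat template for prompting.
messages = [{"role": "user", "content": "Write a one-sentence product blurb."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```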
Key Training Details
- Base Model: Qwen/Qwen2.5-0.5B-Instruct
- Fine-tuning Dataset: 'generator'
- Training Framework: PEFT (Parameter-Efficient Fine-Tuning)
- Hyperparameters (reconstructed in the sketch after this list):
  - Learning Rate: 0.0001
  - Epochs: 5
  - Optimizer: Adam with betas=(0.9, 0.999)
  - Gradient Accumulation Steps: 8
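These hyperparameters map directly onto Hugging Face TrainingArguments. The sketch below is illustrative rather than the author's exact training script: the output directory and per-device batch size are not stated on the card, so the values shown for them are assumptions.

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="trained_model",        # assumption: not stated on the card
    learning_rate=1e-4,                # reported learning rate: 0.0001
    num_train_epochs=5,                # reported epochs: 5
    gradient_accumulation_steps=8,     # reported accumulation steps: 8
    adam_beta1=0.9,                    # reported Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    per_device_train_batch_size=4,     # assumption: not stated on the card
)
```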
Performance Metrics
Over its 5-epoch training run, the model reached a final validation loss of 0.5432. The BERTScore metrics indicate strong semantic similarity between generated and reference texts.
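Metrics like these can be computed with the Hugging Face evaluate library. A short sketch, using hypothetical prediction/reference pairs (the card's evaluation set is not available here) and the library's default English scoring model:

```python
import evaluate

bertscore = evaluate.load("bertscore")

# Hypothetical generated/reference pairs for illustration only.
predictions = ["The cat sat on the mat."]
references = ["A cat was sitting on the mat."]

results = bertscore.compute(
    predictions=predictions, references=references, lang="en"
)
# compute() returns per-example lists; average them to get corpus-level scores
# like the F1 of 0.9321 the card reports.
f1 = sum(results["f1"]) / len(results["f1"])
print(f"BERTScore F1: {f1:.4f}")
```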
Intended Use Cases
Given its fine-tuning on a 'generator' dataset and its evaluation results, this model suits text generation tasks that call for a compact yet capable model. At 0.5 billion parameters it deploys efficiently in resource-constrained environments, and its 32,768-token context length accommodates long inputs.
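For quick inference, the text-generation pipeline is the simplest entry point. A sketch, assuming the repository is directly loadable by the pipeline (recent transformers releases can resolve PEFT adapter repositories automatically when peft is installed):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="taronklm/trained_model")

# The 32,768-token window leaves room for long prompts; this one is illustrative.
prompt = "Summarize the key points of the following meeting notes:\n..."
result = generator(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])
```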