taronklm/trained_model

Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Published: Nov 28, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

taronklm/trained_model is a 0.5 billion parameter language model fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. Developed by taronklm, the model is optimized for text generation tasks and reports a BERTScore F1 of 0.9321 on its evaluation set. Its 32768-token context length makes it suitable for applications that need to process moderately long inputs.


Model Overview

taronklm/trained_model is a 0.5 billion parameter language model fine-tuned from the Qwen/Qwen2.5-0.5B-Instruct base model. It was trained on the 'generator' dataset using the PEFT library, with a focus on text generation. On its evaluation set, the model reaches a BERTScore F1 of 0.9321, with a precision of 0.9305 and a recall of 0.9338.
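
For quick experimentation, a minimal inference sketch is shown below. It assumes the repository hosts standard Transformers-compatible weights and that the usual Qwen2.5 chat template applies; if only a PEFT adapter is published, it would instead need to be loaded on top of Qwen/Qwen2.5-0.5B-Instruct with peft.PeftModel.from_pretrained.

```python
# Minimal inference sketch (assumes standard Transformers weights in the repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "taronklm/trained_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Qwen2.5-Instruct checkpoints ship a chat template for prompting.
messages = [{"role": "user", "content": "Summarize the benefits of small language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```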

Key Training Details

  • Base Model: Qwen/Qwen2.5-0.5B-Instruct
  • Fine-tuning Dataset: 'generator'
  • Training Framework: PEFT (Parameter-Efficient Fine-Tuning)
  • Hyperparameters:
    • Learning Rate: 0.0001
    • Epochs: 5
    • Optimizer: Adam with betas=(0.9, 0.999)
    • Gradient Accumulation Steps: 8
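
The exact adapter configuration and the contents of the 'generator' dataset are not published, so the snippet below is only a hedged sketch of how a comparable PEFT fine-tune could be set up with the hyperparameters listed above; the LoRA settings and the dataset path are illustrative placeholders.

```python
# Hedged PEFT fine-tuning sketch; the LoRA rank/targets and dataset path are
# placeholders, only the hyperparameters from the card are taken as given.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_id = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Illustrative LoRA adapter; the card does not specify rank or target modules.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Placeholder for the undisclosed 'generator' dataset.
dataset = load_dataset("json", data_files="generator.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="trained_model",
    learning_rate=1e-4,             # as listed in the card
    num_train_epochs=5,             # as listed in the card
    gradient_accumulation_steps=8,  # as listed in the card
    adam_beta1=0.9,                 # Adam betas as listed in the card
    adam_beta2=0.999,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```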

Performance Metrics

Over its 5-epoch training run, the model reached a final validation loss of 0.5432. The BERTScore metrics indicate strong semantic similarity between generated and reference texts.
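
As a point of reference, BERTScore precision, recall, and F1 of this kind can be computed with the Hugging Face `evaluate` wrapper around bert_score; the snippet below is a generic sketch, since the actual evaluation set and scoring configuration are not published.

```python
# Generic BERTScore computation sketch; predictions/references are placeholders.
import evaluate

bertscore = evaluate.load("bertscore")
predictions = ["The model generated this answer."]   # model outputs
references = ["The model produced this reply."]      # gold references

scores = bertscore.compute(predictions=predictions, references=references, lang="en")
print("precision:", sum(scores["precision"]) / len(scores["precision"]))
print("recall:   ", sum(scores["recall"]) / len(scores["recall"]))
print("f1:       ", sum(scores["f1"]) / len(scores["f1"]))
```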

Intended Use Cases

Given its fine-tuning on the 'generator' dataset and its reported metrics, this model is well suited to text generation tasks where a compact yet capable model is required. Its 0.5 billion parameters make it efficient to deploy in resource-constrained environments, and the 32768-token context length lets it handle fairly long inputs.