brandolorian/answer-Qwen-stioning

Text Generation · Concurrency Cost: 1 · Model Size: 0.6B · Quant: BF16 · Ctx Length: 32k · Published: Feb 19, 2024 · License: other · Architecture: Transformer

The brandolorian/answer-Qwen-stioning model is a 0.6 billion parameter instruction-tuned causal language model developed by brandolorian, fine-tuned from Qwen1.5-0.5B. It is designed for question-answering tasks and reached an eval_loss of 2.6400 during evaluation. With a context length of 32768 tokens, it is suited to processing moderately long inputs in specific NLP applications.


Model Overview

The brandolorian/answer-Qwen-stioning model is a fine-tuned variant of the Qwen1.5-0.5B architecture, developed by brandolorian. This 0.6 billion parameter model is specifically optimized for question-answering tasks, building upon the foundational capabilities of the Qwen series.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen1.5-0.5B; see the loading sketch after this list.
  • Parameter Count: Features 0.6 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling it to process and understand longer queries and documents.
  • Performance Metrics: Reached an eval_loss of 2.6400 on the evaluation set, with an evaluation throughput of 178.744 samples per second.
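The model can be loaded with the Hugging Face transformers library. Below is a minimal loading-and-generation sketch: the repository id is the one on this card, while the Question/Answer prompt format and the generation settings are illustrative assumptions rather than documented recommendations.

```python
# Minimal loading sketch for brandolorian/answer-Qwen-stioning.
# The prompt format and generation settings below are assumptions,
# not documented recommendations for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "brandolorian/answer-Qwen-stioning"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # picks up the BF16 weights listed above
    device_map="auto",    # requires the accelerate package
)

prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```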

Training Details

The model was fine-tuned with a learning_rate of 2e-05, a train_batch_size of 16, and num_epochs set to 9, using mixed-precision training (Native AMP) and the Adam optimizer. Training ran on Transformers 4.38.0.dev0 and PyTorch 2.1.0+cu121.
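As a rough guide to reproducing a similar setup, the sketch below expresses these hyperparameters as a transformers TrainingArguments object. Only the learning rate, batch size, epoch count, and use of mixed precision come from this card; the output directory is a placeholder, and the card does not specify whether AMP ran in fp16 or bf16.

```python
# Hedged reconstruction of the reported fine-tuning hyperparameters.
# Values not listed on the card (output_dir, precision flavor) are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="answer-Qwen-stioning",   # placeholder output path
    learning_rate=2e-05,                 # reported learning rate
    per_device_train_batch_size=16,      # reported train_batch_size
    num_train_epochs=9,                  # reported num_epochs
    fp16=True,                           # Native AMP; fp16 vs bf16 is not documented
)
```

These arguments would then be passed to a transformers Trainer alongside the Qwen1.5-0.5B base model and a question-answering dataset.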

Intended Use Cases

This model is primarily suited for applications requiring efficient and accurate question-answering capabilities, particularly where the base Qwen1.5-0.5B architecture is a good fit. Its fine-tuning suggests improved performance on tasks involving extracting information or generating direct answers from provided contexts.
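As an illustration of this use case, the snippet below asks a context-grounded question. The Context/Question/Answer prompt template is an assumption; the card does not document the prompt format used during fine-tuning, so adjust it to whatever works best empirically.

```python
# Hypothetical extractive-QA prompt; the Context/Question/Answer template
# is an assumption, not a documented format for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "brandolorian/answer-Qwen-stioning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

context = (
    "Qwen1.5 is a series of decoder-only language models covering sizes "
    "from 0.5B to 72B parameters."
)
prompt = f"Context: {context}\nQuestion: What model sizes does Qwen1.5 cover?\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Strip the prompt tokens so only the generated answer is printed.
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(answer.strip())
```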