shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-reproduce-bs8

Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Jan 22, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-reproduce-bs8 is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It was trained on the mix_high_tweet_1m_new dataset, which points to an optimization for short-form, high-volume content such as social media text. Built on the Qwen2.5 architecture, its most likely applications are processing and generating text in domains characterized by that kind of data.


Model Overview

This model, shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-reproduce-bs8, is a fine-tuned variant of the Qwen2.5-7B-Instruct base model. It has 7.6 billion parameters and is configured with a maximum context length of 131,072 tokens.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-7B-Instruct, indicating a strong foundation in instruction following and general language understanding.
  • Specialized Training Data: The model underwent further training on the mix_high_tweet_1m_new dataset. This suggests a specialization in processing and generating content similar to high-volume social media posts, potentially enhancing its performance on informal or concise text.
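
As a standard Qwen2.5-family checkpoint, the model should load with the Hugging Face transformers library. Below is a minimal sketch, assuming the weights are hosted on the Hub under the repo id shown above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is available on the Hugging Face Hub
# under the same repo id as this model card.
model_id = "shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-reproduce-bs8"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # spread the 7.6B parameters across available GPUs
)
```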

Training Details

The fine-tuning process involved specific hyperparameters:

  • Learning Rate: 1e-05
  • Optimizer: adamw_torch_fused (PyTorch's fused AdamW implementation)
  • Epochs: 3.0
  • Batch Size: A total training batch size of 8 across 8 devices (1 example per device).
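
For reference, these values map onto Hugging Face TrainingArguments roughly as sketched below. This is an illustrative reconstruction, not the author's actual training script: the output directory is hypothetical, the per-device batch size of 1 is inferred from the stated totals, and bf16 is an assumption.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2-5-7b-tweet-ft",  # hypothetical output path
    learning_rate=1e-5,                # reported learning rate
    optim="adamw_torch_fused",         # reported optimizer
    num_train_epochs=3.0,              # reported epochs
    per_device_train_batch_size=1,     # 1 per device x 8 devices = total batch size 8
    bf16=True,                         # assumption: mixed precision, typical for 7B fine-tunes
)
```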

Potential Use Cases

Given its fine-tuning on a tweet-like dataset, this model could be particularly effective for:

  • Analyzing and generating short-form text.
  • Tasks related to social media content understanding or creation.
  • Applications requiring processing of informal language styles.
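
Continuing from the loading sketch above, a short-form generation call might look like the following. The prompt and sampling settings are illustrative, not from the model card; since the base model is an Instruct variant, the tokenizer's chat template is used.

```python
# Reuses `model` and `tokenizer` from the loading example above.
messages = [
    {"role": "user", "content": "Write a one-sentence post about open-source AI."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=64,   # keep the output short-form
    do_sample=True,
    temperature=0.7,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```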