Name: shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-reproduce-bs8 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: shuoxing

Overview

This model, shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-reproduce-bs8, is a 7.6 billion parameter language model. It is a fine-tuned variant of the Qwen/Qwen2.5-7B-Instruct base model, developed by Qwen. The fine-tuning process specifically utilized the mix_low_tweet_1m_new dataset.

Key Characteristics

Base Model: Qwen/Qwen2.5-7B-Instruct.
Parameter Count: 7.6 billion parameters.
Context Length: 131,072 tokens.
Training Data: Fine-tuned on the mix_low_tweet_1m_new dataset, indicating a specialization in short-form, potentially social media-style text.

Training Details

The model underwent training with a learning rate of 1e-05, a train_batch_size of 1, and num_epochs set to 3.0. It utilized a multi-GPU setup with 8 devices and an AdamW optimizer with cosine learning rate scheduling and a 0.1 warmup ratio.

Intended Use Cases

Given its fine-tuning on a dataset likely comprising short, informal text, this model is potentially well-suited for tasks such as:

Generating or analyzing social media posts.
Understanding and responding to short-form conversational text.
Applications requiring text generation with a specific, concise style.

Overview

Overview

Key Characteristics

Training Details

Intended Use Cases

Full Model Card (README)