Overview
This model, Qwen_think_only, is a fine-tuned variant of Qwen/Qwen2.5-7B, with 7.6 billion parameters and a 32768-token context window. It was fine-tuned on the ht-analysis_think_only dataset.
Key Capabilities
- Specialized Fine-tuning: Optimized through training on the ht-analysis_think_only dataset, indicating a focus on specific analytical or reasoning tasks.
- Base Model: Built upon the robust Qwen2.5-7B foundation, inheriting its general language understanding and generation capabilities.
- Extended Context Window: Supports a 32768-token context length, enabling the processing of lengthy and complex inputs.
Training Details
The model was trained using the following hyperparameters:
- Learning Rate: 1e-05
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- Epochs: 3.0
- Batch Size: 1 per device (train), 8 (eval), with 12 gradient accumulation steps, giving a total effective training batch size of 24 (which implies 2 devices, since 1 × 12 × 2 = 24).
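For reference, the hyperparameters above can be collected into a single configuration. The sketch below uses plain Python; the field names mirror common Hugging Face TrainingArguments conventions, and the device count of 2 is an assumption inferred from the stated effective batch size of 24 rather than a documented fact.

```python
# Hypothetical reconstruction of the training configuration described above.
# num_devices is an assumption: only the effective batch size of 24 is stated,
# and 1 (per-device batch) x 12 (grad accum) x 2 (devices) = 24.
train_config = {
    "learning_rate": 1e-05,
    "optimizer": "adamw",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "num_train_epochs": 3.0,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 12,
    "num_devices": 2,  # assumed, see note above
}

def effective_batch_size(cfg: dict) -> int:
    """Effective training batch = per-device batch x grad-accum steps x devices."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * cfg["num_devices"]
    )

print(effective_batch_size(train_config))  # -> 24
```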
Good For
- Applications requiring analysis or reasoning aligned with the ht-analysis_think_only dataset's characteristics.
- Tasks benefiting from a large context window for detailed information processing.
Limitations
- The intended uses and limitations are not fully documented; further evaluation is recommended before deploying the model in applications beyond its fine-tuning domain.