ShourenWSR/HT-phase_scale-Qwen-140k-phase2

Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Sep 16, 2025 · License: other · Architecture: Transformer

ShourenWSR/HT-phase_scale-Qwen-140k-phase2 is a 7.6-billion-parameter language model, fine-tuned from a phase-1 Qwen checkpoint on the phase2_140k dataset. It is the second iteration in a multi-phase training process, building on prior fine-tuning, and is designed for general language understanding and generation tasks, leveraging the Qwen architecture for broad applicability.


Model Overview

ShourenWSR/HT-phase_scale-Qwen-140k-phase2 is a 7.6-billion-parameter language model representing the second phase of fine-tuning in a multi-stage training process. It builds on an earlier checkpoint, Qwen_phase1_140k, and was further fine-tuned on a dataset referred to as phase2_140k, an iterative approach that points to progressive refinement of the model's capabilities.
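
The card does not include usage code, but since the model derives from Qwen, it should load through the standard Hugging Face causal-LM interface. The sketch below assumes exactly that; the prompt and generation settings are illustrative, not taken from the card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal loading sketch. Assumes the checkpoint exposes the standard
# Qwen/Transformers causal-LM interface; not confirmed by the card.
model_id = "ShourenWSR/HT-phase_scale-Qwen-140k-phase2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's stored precision
    device_map="auto",   # spread weights across available devices
)

prompt = "Explain the difference between fine-tuning and pretraining."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```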

Training Details

The model was trained with the following hyperparameters (restated as a Trainer configuration sketch after the list):

  • Learning Rate: 1e-05
  • Batch Size: A per-device train_batch_size of 1 with gradient_accumulation_steps of 12 across 2 GPUs, giving a total_train_batch_size of 24 (1 × 12 × 2).
  • Optimizer: adamw_torch with default betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: Trained for 3.0 epochs.
  • Environment: Distributed training across 2 GPUs.
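
For readers who want these numbers in one place, here is the same configuration expressed as Hugging Face TrainingArguments. The field values come from the list above; the use of the Trainer API itself, and the output directory name, are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as a Trainer configuration.
# Values mirror the card; the Trainer API usage is an assumption.
training_args = TrainingArguments(
    output_dir="HT-phase_scale-Qwen-140k-phase2",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=12,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",  # default betas (0.9, 0.999) and epsilon (1e-8)
)
# With 2 GPUs: 1 (per device) x 12 (accumulation) x 2 (GPUs) = 24 effective batch size.
```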

Intended Use Cases

While specific intended uses are not detailed in the card, as a fine-tuned Qwen-based model it is generally suitable for a wide range of natural language processing tasks, including text generation, summarization, and question answering. Its 7.6B parameter count and 32,768-token context length also make it practical for long-input workloads.
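
As a concrete illustration of one such task, the sketch below frames summarization as plain instruction-style text generation via the transformers pipeline API. The prompt wording and decoding settings are illustrative assumptions, not documented behavior.

```python
from transformers import pipeline

# Illustrative sketch: summarization framed as instruction-style
# generation. Prompt and decoding settings are assumptions.
generator = pipeline(
    "text-generation",
    model="ShourenWSR/HT-phase_scale-Qwen-140k-phase2",
    device_map="auto",
)

article = "..."  # input text; the 32,768-token context leaves ample room
result = generator(
    f"Summarize the following article in three sentences:\n\n{article}",
    max_new_tokens=200,
    do_sample=False,  # greedy decoding for a deterministic summary
)
print(result[0]["generated_text"])
```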