AlexCuadron/Qwen-32B-8a4e8f3a

Text Generation · Model Size: 32.8B · Quantization: FP8 · Context Length: 32k · Published: Mar 5, 2025 · License: other · Architecture: Transformer

AlexCuadron/Qwen-32B-8a4e8f3a is a 32.8 billion parameter language model fine-tuned from Qwen/Qwen2.5-32B. It was trained specifically on the fc_rlm dataset, indicating a specialization for tasks drawn from that data distribution.


Model Overview

This model, AlexCuadron/Qwen-32B-8a4e8f3a, is a fine-tuned variant of the Qwen2.5-32B architecture, developed by AlexCuadron. It comprises 32.8 billion parameters and has a context length of 32768 tokens.
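As a minimal loading sketch, assuming the repository follows the standard Hugging Face checkpoint layout for Qwen2.5-based causal LMs (the card does not include a usage snippet):

```python
# Minimal loading sketch; assumes a standard Hugging Face checkpoint layout.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AlexCuadron/Qwen-32B-8a4e8f3a"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick up the checkpoint's stored dtype
    device_map="auto",    # shard across available GPUs; a 32.8B model needs several
)
```

Note that `device_map="auto"` requires the `accelerate` package to be installed.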

Key Characteristics

  • Base Model: Built upon the robust Qwen2.5-32B foundation.
  • Specialized Fine-tuning: The model was fine-tuned on the fc_rlm dataset. This targeted training suggests improved performance on tasks aligned with the content of that dataset.

Training Details

The fine-tuning process used the following key hyperparameters (see the configuration sketch after this list):

  • Learning Rate: 1e-05
  • Batch Size: A per-device train_batch_size of 2 with gradient_accumulation_steps of 4 across 8 GPUs gives a total_train_batch_size of 64; a per-device eval_batch_size of 8 across the same 8 GPUs gives a total_eval_batch_size of 64.
  • Optimizer: ADAMW_TORCH with standard betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: Trained for 4.0 epochs.
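For reference, these values map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a reconstruction from the reported hyperparameters, not the author's actual training script; the output path is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Reconstruction of the reported setup; the fc_rlm data pipeline
# and the actual fine-tuning script are not published with the card.
training_args = TrainingArguments(
    output_dir="qwen-32b-fc_rlm",      # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,     # x 4 accumulation x 8 GPUs = 64 effective
    per_device_eval_batch_size=8,      # x 8 GPUs = 64 effective
    gradient_accumulation_steps=4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=4.0,
)
```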

Intended Use

The card does not list specific intended uses or limitations, but fine-tuning on the fc_rlm dataset implies suitability for applications where data resembling fc_rlm is prevalent, or where performance on that particular distribution is critical.
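Assuming the model was loaded as sketched above, inference follows the usual `transformers` generation pattern; the prompt here is an arbitrary placeholder, not an fc_rlm example:

```python
# Illustrative generation call; continues from the loading sketch above.
prompt = "Explain the difference between supervised fine-tuning and RLHF."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```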