laion/sera-316-opt1k__Qwen3-8B

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Mar 27, 2026 · License: other · Architecture: Transformer

The laion/sera-316-opt1k__Qwen3-8B model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on a thinking-preprocessed snapshot of the laion/allenai-sera-unified-316 dataset (local cache path: /e/data1/datasets/playground/ot/hf_hub/datasets--laion--allenai-sera-unified-316/snapshots/ef551d7ec9bb11780e15657490451a6fc6842c46_thinking_preprocessed). The model is tuned for tasks aligned with that dataset and offers specialized performance within its intended domain.


Model Overview

laion/sera-316-opt1k__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base architecture on the thinking-preprocessed snapshot of the laion/allenai-sera-unified-316 dataset described above. A minimal inference sketch is shown below.
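
The card does not include usage code, so the following is a hedged sketch of how the model could be loaded for text generation with Hugging Face Transformers. Only the repository id laion/sera-316-opt1k__Qwen3-8B comes from this card; the prompt, sampling settings, and chat-template usage are illustrative assumptions.

```python
# Minimal text-generation sketch (assumptions: transformers + torch installed,
# and the checkpoint exposes a standard tokenizer with a chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/sera-316-opt1k__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let the checkpoint decide the dtype
    device_map="auto",    # place weights on the available GPU(s)
)

# Qwen3-style chat formatting via the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize the idea of gradient accumulation."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```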

Training Details

The fine-tuning run used the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation Steps: 3, giving an effective global batch size of 96 (1 per device × 3 accumulation steps × 32 GPUs)
  • Optimizer: AdamW_Torch_Fused with betas=(0.85, 0.98) and epsilon=1e-08
  • LR Scheduler: Cosine type with a warmup ratio of 0.1
  • Epochs: 7.0
  • Distributed Training: Multi-GPU setup across 32 devices.
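
For orientation, the hyperparameters above map roughly onto a Hugging Face TrainingArguments configuration as sketched below. This is a reconstruction for illustration, not the original training script; the output directory is hypothetical, and argument names follow the standard transformers API.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sera-316-opt1k__Qwen3-8B",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=3,   # 1 x 3 x 32 GPUs = effective batch of 96
    num_train_epochs=7.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    adam_beta1=0.85,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
)
```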

Intended Use

Specific intended uses and limitations are not documented. Because the model was fine-tuned on a single dataset, its primary utility lies in applications aligned with the characteristics and content of that dataset; developers should review the laion/allenai-sera-unified-316 dataset when evaluating the model's suitability for their use case.