Name: laion/sera-316__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Model Overview

laion/sera-316__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. The training data is recorded only as a local path, /e/data1/datasets/playground/ot/hf_hub/datasets--laion--allenai-sera-unified-316, which appears to refer to the allenai-sera-unified-316 dataset under the laion namespace. The model supports a context length of 32768 tokens, so it can process and generate long sequences of text.
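As a rough illustration of what the 32768-token context length means in practice, the sketch below computes how many tokens remain for generation once a prompt is counted. It assumes the window is shared between prompt and generated tokens, as is usual for decoder-only models. The helper name max_new_tokens is hypothetical and not part of any API described here.

```python
CONTEXT_LENGTH = 32768  # context length stated in the model overview


def max_new_tokens(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Assumption: prompt and generated tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: a prompt longer than the window leaves no room to generate.
    return max(0, context_length - prompt_tokens)


if __name__ == "__main__":
    # A 30000-token prompt leaves 2768 tokens for the completion.
    print(max_new_tokens(30000))
```

The prompt token count itself would have to come from the model's tokenizer. That step is left out here to keep the sketch dependency-free.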

Training Details

The fine-tuning was conducted with the following key hyperparameters:

Learning Rate: 4e-05
Batch Size: 1 (train), 8 (eval)
Gradient Accumulation: 3 steps, for a total effective training batch size of 96 (with a train batch size of 1, this implies 32 data-parallel devices: 1 × 3 × 32 = 96)
Optimizer: AdamW (fused PyTorch implementation, adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-08
Epochs: 7
Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
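The figures in this list can be checked with a little arithmetic. The sketch below derives the device count implied by the batch-size numbers and evaluates a cosine schedule with linear warmup at the stated peak learning rate. It is an illustration under assumptions, not the training code that was used. The function names are hypothetical, the total step count is a caller-supplied placeholder because the source does not give one, and the schedule is assumed to decay to zero with warmup steps rounded up.

```python
import math

# Values quoted in the hyperparameter list.
PEAK_LR = 4e-5
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 3
EFFECTIVE_BATCH = 96
WARMUP_RATIO = 0.1


def implied_device_count(effective: int, per_device: int, accum: int) -> int:
    """Number of data-parallel workers implied by the batch-size figures."""
    per_worker = per_device * accum
    if effective % per_worker != 0:
        raise ValueError("effective batch is not a multiple of the per-worker batch")
    return effective // per_worker


def lr_at(step: int, total_steps: int,
          peak_lr: float = PEAK_LR, warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then cosine decay.

    Assumptions: warmup steps are rounded up, and the cosine decays to
    exactly zero at total_steps. total_steps is not given in the source.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(implied_device_count(EFFECTIVE_BATCH, TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    # Hypothetical 1000-step run: the peak is reached after the first 10% of steps.
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step, 1000))
```

For a hypothetical 1000-step run, the learning rate climbs linearly over the first 100 steps, peaks at 4e-05, and then follows a half cosine down to zero.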

Because the model was fine-tuned solely on the allenai-sera-unified-316 dataset, it is likely best suited to tasks that resemble that data. The contents of the dataset are not described here, so intended uses and limitations cannot be stated more precisely.
