laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v6

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Apr 24, 2026 · Architecture: Transformer

laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v6 is an 8 billion parameter language model based on the Qwen3 architecture, fine-tuned by LAION using Axolotl. This model was specifically trained with an extended sequence length of 32768 tokens to improve stability and performance in long, multi-turn conversational contexts, addressing issues like degenerate token generation. It is optimized for robust handling of extensive context, making it suitable for applications requiring deep conversational memory or processing large inputs.


Model Overview

laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v6 is an 8 billion parameter language model built upon the Qwen3 architecture. This model was fine-tuned by LAION using the Axolotl framework, specifically addressing stability issues encountered in previous iterations (Sera v3) during long, multi-turn conversations, particularly when processing large tool observations.
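
For reference, below is a minimal inference sketch using the Hugging Face transformers library. The repository id is taken from this page's title and is assumed to resolve on the Hub; the dtype and device-placement choices are illustrative and not prescribed by the model card.

```python
# Minimal inference sketch (assumes the checkpoint is published on the
# Hugging Face Hub under the repo id below; adjust if it is hosted elsewhere).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # spread layers across available GPUs/CPU
)

messages = [{"role": "user", "content": "Summarize the key ideas of attention in transformers."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```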

Key Capabilities

  • Extended Context Handling: Trained with a sequence_len of 32768 tokens, significantly enhancing its ability to maintain coherence and stability over extensive conversational histories or large input contexts (see the long-context sketch after this list).
  • Improved Multi-Turn Stability: The training regimen, which included increasing the dataset size and number of epochs, aimed to prevent degenerate token generation (e.g., "4.4.4.4…" or "for-the-for-the…") that occurred in earlier versions with long contexts.
  • Axolotl Framework: Fine-tuned with Axolotl version 0.16.0.dev0.
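
As referenced above, here is a sketch of feeding a long multi-turn history to the model, reusing the tokenizer and model objects from the previous example. The 32768-token budget corresponds to the training sequence_len; the truncation strategy (dropping the oldest turns) is an assumption for illustration, not something the model card specifies.

```python
# Long-context sketch: keep the chat history within the 32768-token budget
# used during training. The oldest-turn-dropping policy below is only an
# illustrative assumption.
MAX_CONTEXT = 32768

def build_inputs(history, tokenizer, max_context=MAX_CONTEXT):
    """Tokenize a chat history, dropping the oldest turns until it fits."""
    while True:
        ids = tokenizer.apply_chat_template(
            history, add_generation_prompt=True, return_tensors="pt"
        )
        if ids.shape[-1] <= max_context or len(history) <= 1:
            return ids
        history = history[1:]  # drop the oldest message and retry

history = [
    {"role": "user", "content": "Here is a large tool observation: ..."},
    {"role": "assistant", "content": "Acknowledged, continuing the analysis."},
    {"role": "user", "content": "Given everything above, what changed between runs?"},
]
input_ids = build_inputs(history, tokenizer).to(model.device)
reply = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(reply[0][input_ids.shape[-1]:], skip_special_tokens=True))
```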

Training Details

The model was trained for 6 epochs with a learning rate of 1e-05 and a total batch size of 32, using a cosine learning rate scheduler. Training ran for 109 optimizer steps with gradient accumulation set to 8. The base model for this fine-tune was Qwen/Qwen3-8B.
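
As a back-of-the-envelope check, the figures above combine as follows. The per-device micro-batch size and device count are not stated in this section, so the values below are assumptions chosen only to satisfy the reported total batch size of 32.

```python
# Sanity-check of the reported training setup. Only epochs, total batch size,
# gradient accumulation, and step count come from the model card; the
# micro-batch size and device count are assumptions for illustration.
epochs = 6
total_batch_size = 32
grad_accum_steps = 8
total_optimizer_steps = 109

micro_batch_size = 2   # assumption
num_devices = 2        # assumption
assert micro_batch_size * num_devices * grad_accum_steps == total_batch_size

# Implied dataset size (ignoring any partial final batch):
approx_samples = total_optimizer_steps * total_batch_size / epochs
print(f"~{approx_samples:.0f} training samples implied by 109 steps over 6 epochs")
```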