laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v7

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Apr 25, 2026 · Architecture: Transformer

laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v7 is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B with a 32768 token context length. This model was trained using Axolotl on the laion/Sera-4.6-Lite-T2-v4-1000 dataset, specifically addressing issues with long multi-turn contexts and tool observations exceeding 20KB. It is optimized for stability in extended conversational interactions where previous versions experienced token degeneration.


Model Overview

This model, laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v7, is an 8 billion parameter language model fine-tuned from the Qwen/Qwen3-8B base model. It was developed using the Axolotl framework, with a focus on improving performance in long, multi-turn conversational contexts.
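
As a minimal usage sketch, the snippet below loads the checkpoint with Hugging Face Transformers and runs a single chat turn. The dtype, device placement, and generation settings are illustrative assumptions, not values documented for this model.

```python
# Minimal loading/generation sketch; generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v7"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # actual precision depends on your hardware and runtime
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the last tool output."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```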

Key Differentiator

The primary goal of this fine-tuning was to resolve issues observed in previous Sera versions, where the model's output would degrade (e.g., "4.4.4.4…" or "for-the-for-the…") when encountering tool observations larger than approximately 20KB within a multi-turn conversation. The training dataset was scaled up significantly (from 316 to 1000 rows) and the number of epochs increased to enhance the model's stability and prevent token degeneration in extended contexts.
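
The sketch below illustrates, hypothetically, the kind of prompt this fine-tune targets: a multi-turn chat in which one turn carries a tool observation larger than 20 KB. The message contents and the choice to fold the observation into a user turn are assumptions; only the model identifier and the 32768-token limit come from this card.

```python
# Hypothetical stress case: a multi-turn chat containing a >20 KB tool observation.
from transformers import AutoTokenizer

model_id = "laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B-v7"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Simulate a roughly 25 KB tool observation (contents are illustrative).
big_observation = '{"level": "ERROR", "msg": "synthetic log line"}\n' * 550

messages = [
    {"role": "user", "content": "Fetch the deployment logs and summarize the failures."},
    {"role": "assistant", "content": "Calling the log-retrieval tool now."},
    # Folded into a user turn to stay chat-template-agnostic; some templates
    # also accept a dedicated "tool" role for observations.
    {"role": "user", "content": f"Tool observation:\n{big_observation}\nSummarize the failures."},
]

prompt_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(f"observation bytes: {len(big_observation.encode('utf-8'))}")
print(f"prompt tokens: {len(prompt_ids)} (must stay under the 32768-token limit)")
```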

Training Details

  • Base Model: Qwen/Qwen3-8B
  • Dataset: laion/Sera-4.6-Lite-T2-v4-1000 (a JSONL dataset)
  • Sequence Length: 32768 tokens
  • Learning Rate: 1e-05
  • Optimizer: AdamW with betas=(0.9, 0.95)
  • Epochs: 12
  • Gradient Accumulation Steps: 8
  • Total Training Steps: 218
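
As a rough consistency check (assuming each optimizer step covers fresh examples and no sample packing is applied; the per-device micro-batch size is not stated in this card and is inferred here), the listed figures imply an effective batch size of roughly 55 sequences per optimizer step:

```python
# Back-of-envelope check of the listed hyperparameters.
# The per-device micro-batch size and GPU count are not documented,
# so the last figure is an inference, not a stated value.
rows = 1000            # dataset rows
epochs = 12
total_steps = 218      # optimizer steps
grad_accum = 8

examples_seen = rows * epochs
effective_batch = examples_seen / total_steps          # examples per optimizer step
micro_batch_x_devices = effective_batch / grad_accum   # sequences per forward pass, summed over devices

print(f"examples seen:         {examples_seen}")          # 12000
print(f"effective batch size:  {effective_batch:.1f}")    # ~55.0
print(f"micro-batch x devices: {micro_batch_x_devices:.1f}")  # ~6.9
```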

Intended Use Cases

This model is particularly suited for applications requiring stable and coherent responses in long, multi-turn dialogues, especially those involving the integration of large tool observations or complex contextual information. Its enhanced stability in extended contexts makes it a robust choice for agents or conversational AI systems that handle detailed interactions.