laion/sera-subset-mixed-10000-axolotl__Qwen3-8B-v8

Text generation · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Apr 29, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights · Concurrency cost: 1

laion/sera-subset-mixed-10000-axolotl__Qwen3-8B-v8 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained with axolotl on a 10,000-row mixed subset of the `ethanlshen/sera-subset` dataset, following the SERA recipe. The model has a 32,768-token context length and is intended for applications that require robust performance on structured reasoning and agentic tasks.


Model Overview

This model, laion/sera-subset-mixed-10000-axolotl__Qwen3-8B-v8, is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It underwent Supervised Fine-Tuning (SFT) with the axolotl framework, following the SERA recipe to improve performance on complex reasoning tasks.
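
A minimal loading sketch using Hugging Face transformers is shown below; it assumes the checkpoint is published on the Hub under the same identifier and that your installed transformers version supports the Qwen3 architecture:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/sera-subset-mixed-10000-axolotl__Qwen3-8B-v8"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card reports bf16 training
    device_map="auto",           # place weights across available devices
)
```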

Key Training Details

The model was fine-tuned on a 10,000-row random mixed subset of the ethanlshen/sera-subset dataset, which includes both unresolved (stage1) and resolved (stage2) data. Key hyperparameters for training include:

  • Learning Rate: 1e-5
  • Batch Size: 32 (global)
  • Epochs: 3
  • Context Length: 32,768 tokens
  • Chat Template: ChatML

Training used bf16 precision and DeepSpeed ZeRO-3, following iteration i9, version v8 of the upstream SERA recipe from the open-thoughts/OpenThoughts-Agent repository.
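
Because the card lists ChatML as the chat template, conversations should be formatted through the tokenizer's built-in template rather than by hand. A short sketch, with illustrative messages that are not part of the model card:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Outline a step-by-step plan to debug a failing unit test."},
]

# Renders the conversation with the tokenizer's chat template
# (ChatML, per the training configuration) and appends the
# assistant turn header so the model continues from there.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
```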

Intended Use Cases

This model is particularly well-suited to applications that benefit from its fine-tuning on the SERA dataset, which focuses on structured reasoning and agentic capabilities. Its 32,768-token context window makes it suitable for processing and understanding lengthy inputs relevant to such tasks.
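
A generation sketch that continues from the snippets above; the sampling parameters are illustrative defaults, not values recommended by the model authors:

```python
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,  # illustrative; the context window allows up to 32,768 tokens in total
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```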