laion/Sera-4.5A-Full-T1-v3-1000-axolotl__Qwen3-8B

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Apr 22, 2026 · Architecture: Transformer

laion/Sera-4.5A-Full-T1-v3-1000-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B by laion. It was trained with the axolotl framework on the laion/Sera-4.5A-Full-T1-v3-1000 dataset and supports a context length of 32,768 tokens. The model is intended for general language generation tasks, building on its Qwen3 base and chat-formatted fine-tuning data.
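The card does not include a usage snippet; assuming the checkpoint ships in the standard Hugging Face format, loading and prompting it would look roughly like the sketch below. The prompt text and generation settings are illustrative assumptions, not recommendations from the model authors.

```python
# A minimal quick-start sketch, assuming the checkpoint ships in the standard
# Hugging Face format. Prompt text and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Sera-4.5A-Full-T1-v3-1000-axolotl__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's native precision (bf16 per this card)
    device_map="auto",   # spread weights across available devices
)

# The fine-tuning data is chat-formatted, so prompts go through the chat template.
messages = [{"role": "user", "content": "Explain gradient checkpointing in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that `device_map="auto"` requires the accelerate package to be installed.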

Model Overview

laion/Sera-4.5A-Full-T1-v3-1000-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the base model Qwen/Qwen3-8B. It was developed by laion using the axolotl framework, version 0.16.0.dev0.

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Parameter Count: 8 billion
  • Context Length: Supports sequences of up to 32,768 tokens, enabling long inputs to be processed in a single pass.
  • Training Data: Fine-tuned on the laion/Sera-4.5A-Full-T1-v3-1000 dataset, a JSONL dataset of messages formatted for chat templates (see the sketch after this list).
  • Training Configuration: Uses bf16 precision, flash_attention, and gradient_checkpointing for efficient training.
  • Optimizer: Trained with the adamw_torch optimizer, a learning rate of 1e-05, and a cosine learning-rate scheduler.
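Since the training data is chat-formatted JSONL, each record is rendered through the base model's chat template before tokenization. The sketch below shows that step; the field names used here are an assumption about laion/Sera-4.5A-Full-T1-v3-1000, as the card only states that records carry chat-template-style messages.

```python
# Sketch of rendering one chat-formatted JSONL record the way a trainer would.
# The field names below ("messages", "role", "content") are an assumption about
# laion/Sera-4.5A-Full-T1-v3-1000; the card only states that records hold
# messages formatted for chat templates.
import json
from transformers import AutoTokenizer

# Use the base model's tokenizer, which carries the Qwen3 chat template.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

line = '{"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi! How can I help?"}]}'
record = json.loads(line)

# Render the conversation into the flat text the trainer tokenizes.
text = tokenizer.apply_chat_template(record["messages"], tokenize=False)
print(text)
```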

Intended Use Cases

This model is suited to general-purpose language generation and understanding tasks, benefiting from its 32k context window and chat-formatted fine-tuning data. Its Qwen3 base provides a strong foundation for applications requiring robust language capabilities.
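When working near the context limit, it can help to verify that an input fits the 32,768-token window before generation. A minimal sketch, assuming a hypothetical local file and simple truncation:

```python
# Long-input sanity check against the 32,768-token context window.
# "report.txt" is a hypothetical local file used only for illustration.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "laion/Sera-4.5A-Full-T1-v3-1000-axolotl__Qwen3-8B"
)

with open("report.txt") as f:  # hypothetical long document
    long_text = f.read()

ids = tokenizer(long_text, truncation=True, max_length=32768)["input_ids"]
print(f"{len(ids)} tokens after truncating to the 32k window")
```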