laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Apr 23, 2026 · Architecture: Transformer

laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B is an 8-billion-parameter causal language model fine-tuned by laion from Qwen3-8B. It was trained with Axolotl on the Sera-4.6-Lite-T2-v4-1000 dataset, which incorporates pre-rendered tool calls in the Hermes/Qwen3 wire format. The model is optimized for tool use and function calling and supports a 32,768-token context length.
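The model card does not prescribe an inference stack; below is a minimal sketch using the standard Hugging Face transformers API. The dtype, device placement, and generation parameters are illustrative assumptions, not values specified by the card.

```python
# Minimal inference sketch; assumes a standard transformers setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the available hardware
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the benefits of a 32k context window."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```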


Model Overview

laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base model. It was developed by laion using the Axolotl framework (version 0.16.0.dev0).

Key Characteristics

  • Base Model: Qwen/Qwen3-8B, a strong general-purpose foundation model.
  • Training Dataset: Fine-tuned on laion/Sera-4.6-Lite-T2-v4-1000, a specialized dataset that includes pre-rendered tool calls in the Hermes/Qwen3 wire format, indicating an optimization for tool-use and function-calling capabilities (see the sketch after this list).
  • Context Length: Supports a sequence length of 32,768 tokens, enabling processing of longer inputs and maintaining context over extended conversations or documents.
  • Training Configuration: Uses the ChatML chat template and was trained for 3 epochs with a learning rate of 1e-05, the adamw_torch optimizer, and a cosine learning rate scheduler.
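A hedged sketch of tool-use prompting follows, assuming the tokenizer's ChatML-based chat template accepts a `tools` argument, as Qwen3 tokenizers do. The `get_weather` schema is a hypothetical example for illustration, not part of the training dataset.

```python
# Sketch of tool-use prompting and tool-call parsing; the tool schema is hypothetical.
import json
import re

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("laion/Sera-4.6-Lite-T2-v4-1000-axolotl__Qwen3-8B")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Berlin?"}]
# Render the prompt with the tool definitions injected by the chat template.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)

# In the Hermes/Qwen3 wire format the model emits calls wrapped in
# <tool_call>...</tool_call> tags containing a JSON object.
def parse_tool_calls(text: str) -> list[dict]:
    return [
        json.loads(m)
        for m in re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    ]
```

After generating from `prompt`, `parse_tool_calls` extracts each call's `name` and `arguments` so your application can execute the tool and feed the result back as a follow-up message.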

Potential Use Cases

This model is likely well-suited for applications requiring advanced function calling, agentic workflows, or scenarios where understanding and generating responses based on structured tool interactions are crucial. Its large context window also makes it suitable for tasks involving extensive document analysis or complex multi-turn dialogues.