laion/Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 23, 2026 · Architecture: Transformer

laion/Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained with Axolotl on the laion/Sera-4.6-Lite-T2-v4-316 dataset, which incorporates pre-rendered tool calls in the Hermes/Qwen3 wire format. The model targets tool use and structured interactions and supports a 32,768-token context length.


Model Overview

This model, laion/Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B, is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B using the Axolotl framework on the laion/Sera-4.6-Lite-T2-v4-316 dataset. A key characteristic of the training data is that tool calls are pre-rendered in the Hermes/Qwen3 wire format directly into the message content, rather than supplied as separate structured fields.
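For illustration, a minimal sketch of what such pre-rendered turns typically look like under the Hermes/Qwen3 convention; the get_weather call and the tool response values are hypothetical:

```python
# Hypothetical assistant turn with a pre-rendered tool call in the
# Hermes/Qwen3 wire format: the call is serialized as JSON inside
# <tool_call>...</tool_call> tags directly in the message content.
assistant_content = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Berlin"}}\n'
    "</tool_call>"
)

# The corresponding tool result is typically wrapped in <tool_response>
# tags on a following turn (names and values here are illustrative).
tool_response_content = (
    "<tool_response>\n"
    '{"temperature_c": 18, "condition": "cloudy"}\n'
    "</tool_response>"
)
```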

Key Capabilities

  • Tool Call Integration: Trained on examples where tool calls are pre-rendered into the message content, which should help the model both interpret and emit structured function calls.
  • Qwen3-8B Base: Inherits the strong foundational capabilities of the Qwen3-8B base model.
  • Extended Context Window: Uses a sequence_len of 32768 tokens, enabling long inputs and sustained context over extended dialogues or documents (see the inference sketch after this list).
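A minimal inference sketch, assuming the checkpoint loads with Hugging Face transformers and that the tokenizer ships the standard Qwen3 chat template with tools= support; the get_weather schema is hypothetical:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Hypothetical tool schema passed through the chat template.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Berlin?"}]
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens; keep special tokens so any
# <tool_call> tags remain visible in the output.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=False))
```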

Training Details

The model was trained with a learning rate of 1e-05 using the AdamW optimizer. The total batch size was 32 across 4 GPUs with gradient accumulation steps set to 8, and training ran for 17 optimizer steps over 3 epochs under a cosine learning-rate schedule with a warmup ratio of 0.1875. The arithmetic behind these numbers is sketched below.
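A back-of-envelope check, assuming a per-device micro-batch size of 1 (inferred from 4 × 8 × 1 = 32; the card does not state it explicitly):

```python
# Effective batch size from the stated distributed-training settings.
num_gpus = 4
grad_accum_steps = 8
micro_batch_size = 1  # inferred, not stated on the card

effective_batch = num_gpus * grad_accum_steps * micro_batch_size
assert effective_batch == 32  # matches the reported total batch size

# Warmup ratio of 0.1875 over 17 total steps gives roughly 3 warmup steps.
total_steps = 17
warmup_ratio = 0.1875
warmup_steps = round(total_steps * warmup_ratio)  # 3.1875 -> 3
print(effective_batch, warmup_steps)
```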

Good For

  • Applications requiring tool use or function calling capabilities (a sketch for parsing generated tool calls follows this list).
  • Tasks benefiting from a large context window.
  • Developers looking for a fine-tuned Qwen3-8B variant with specialized training for structured outputs.
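Since the model emits tool calls as in-band tags, a caller needs to recover them from the generated text. A minimal parsing sketch, assuming the Hermes/Qwen3 convention of JSON wrapped in <tool_call>...</tool_call> tags (robust code would also handle truncated or malformed tags):

```python
import json
import re

# Matches a JSON object wrapped in Hermes/Qwen3-style tool-call tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in the generated text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # skip calls the model failed to serialize cleanly
    return calls

sample = (
    '<tool_call>\n'
    '{"name": "get_weather", "arguments": {"location": "Berlin"}}\n'
    '</tool_call>'
)
print(extract_tool_calls(sample))
# [{'name': 'get_weather', 'arguments': {'location': 'Berlin'}}]
```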