Model Overview
The hamishivi/tmax-qwen3-4b-sft-20260316-100k-asst-loss model is a 4-billion-parameter language model built on the Qwen3 architecture. It was developed by hamishivi and fine-tuned using the Hugging Face TRL (Transformer Reinforcement Learning) library. The model supports a context window of 32,768 tokens, allowing it to handle long and complex conversational exchanges.
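A minimal usage sketch, assuming the standard transformers chat-template API; the generation settings below are illustrative and not taken from this model card:

```python
# Hypothetical usage sketch via Hugging Face transformers; settings are
# illustrative assumptions, not verified against this specific checkpoint.
MODEL_ID = "hamishivi/tmax-qwen3-4b-sft-20260316-100k-asst-loss"

def generate_reply(messages, max_new_tokens=256):
    # Imports are deferred so the sketch can be inspected without
    # transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Qwen3-style chat models expect input formatted by the tokenizer's
    # chat template, with a generation prompt appended.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

Calling `generate_reply([{"role": "user", "content": "Hello"}])` would download the checkpoint and return the assistant's reply as a string.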
Key Capabilities
- Qwen3 Architecture: Leverages the robust foundation of the Qwen3 model family.
- Supervised Fine-Tuning (SFT): Enhanced through SFT to improve performance on conversational and instruction-following tasks; the "asst-loss" suffix in the model name suggests the training loss was computed only on assistant turns, a common SFT practice.
- Extended Context Length: Supports a 32,768 token context, beneficial for maintaining coherence over long dialogues or processing large documents.
- TRL Framework: Trained with TRL, Hugging Face's widely used library for fine-tuning and post-training transformer models.
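To illustrate how the 32,768-token window might be budgeted in an application, here is a sketch of a history-trimming helper. The token counter is a crude character-based stand-in (an assumption, not the model's tokenizer); a real application would count tokens with the model's own tokenizer instead:

```python
# Hypothetical helper: trim a chat history so it fits the model's
# 32,768-token context window, keeping the most recent turns.
CONTEXT_WINDOW = 32_768

def estimate_tokens(text):
    # Crude stand-in heuristic (~4 characters per token for English);
    # replace with the model tokenizer for accurate counts.
    return max(1, len(text) // 4)

def trim_history(messages, reserve_for_reply=1_024):
    # Reserve part of the window for the model's generated reply.
    budget = CONTEXT_WINDOW - reserve_for_reply
    kept, used = [], 0
    # Walk from newest to oldest, keeping messages while they fit.
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

Dropping oldest-first preserves the turns most relevant to the next reply, which matches how long-dialogue assistants typically spend a fixed context budget.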
Good For
- Assistant-like Applications: The fine-tuning process suggests suitability for roles requiring interactive responses, such as chatbots or virtual assistants.
- Long-form Conversations: Its large context window makes it well-suited for maintaining context and generating relevant responses over extended dialogues.
- Instruction Following: SFT often improves a model's ability to understand and execute user instructions effectively.