hamishivi/tmax-qwen3-4b-sft-20260316-100k-asst-loss
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Mar 16, 2026 · Architecture: Transformer · Status: Warm

The hamishivi/tmax-qwen3-4b-sft-20260316-100k-asst-loss model is a 4-billion-parameter Qwen3-based language model, fine-tuned with the TRL framework. It supports a 32,768-token context window, making it suitable for processing long inputs. The model is optimized for assistant-style conversational tasks, using supervised fine-tuning (SFT) to strengthen its instruction-following and interactive behavior.
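A minimal sketch of running the model locally with the Hugging Face `transformers` library, assuming the checkpoint ships a standard chat template (the exact serving setup and template are assumptions, not stated on this card):

```python
# Sketch: chat-style inference with transformers (assumed standard Qwen3 setup).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hamishivi/tmax-qwen3-4b-sft-20260316-100k-asst-loss"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# BF16 matches the quantization listed on the card.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

messages = [
    {"role": "user", "content": "Explain supervised fine-tuning in one sentence."}
]
# Render the conversation with the tokenizer's chat template, leaving the
# assistant turn open so the model generates the reply.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(reply)
```

Loading a 4B model in BF16 needs roughly 8 GB of accelerator memory; for constrained hardware, a quantized load or a hosted inference endpoint is the usual alternative.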
