nandansarkar/base_qwen3_0-6B_filter
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Ctx Length: 32k · Published: Dec 11, 2025 · License: other · Architecture: Transformer

The nandansarkar/base_qwen3_0-6B_filter model is a fine-tuned version of the Qwen3-0.6B architecture, developed by nandansarkar. It is a causal language model with roughly 0.8 billion parameters and a maximum context length of 40,960 tokens. The model was fine-tuned on the sft_dataset, which suggests it has been adapted for specific supervised fine-tuning (SFT) tasks.
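Since the model is a Qwen3-based causal language model, it should be loadable through the standard Hugging Face transformers interface. The snippet below is a minimal sketch, not an official usage example: the repository id comes from this card, the BF16 dtype mirrors the quantization listed above, and the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: load the model with transformers, assuming the standard
# Qwen3 causal-LM interface (prompt and generation settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nandansarkar/base_qwen3_0-6B_filter"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the quantization noted on this card
)

prompt = "Explain what a causal language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```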
