bigatuna/Qwen3-0.6B-Sushi-Coder
Text Generation
Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Ctx Length: 32k · Published: Dec 31, 2025 · License: apache-2.0 · Architecture: Transformer

bigatuna/Qwen3-0.6B-Sushi-Coder is a 0.8-billion-parameter causal language model fine-tuned from Qwen3-0.6B and optimized specifically for Python code generation. It was trained with a two-stage process combining Supervised Fine-Tuning on code datasets and GRPO. The model achieves 29.3% pass@1 on HumanEval, a significant improvement over its base model, and is best suited for applications that need efficient, accurate Python code generation within its 40,960-token context length.
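A minimal usage sketch with the Hugging Face `transformers` library, assuming the model ships a standard Qwen3-style chat template (the prompt text here is illustrative, not from the card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigatuna/Qwen3-0.6B-Sushi-Coder"

# Load tokenizer and model; BF16 matches the quantization listed above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Build a chat-formatted prompt for a Python coding task.
messages = [
    {"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."}
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate and decode only the newly produced tokens.
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)
```

Greedy decoding is used by default; for more varied completions, pass `do_sample=True` with a `temperature` to `generate`.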
