latte-agent/qwen3-4b-latte-v6
latte-agent/qwen3-4b-latte-v6 is a 4 billion parameter LoRA fine-tune of Qwen3-4B-Instruct-2507, developed by latte-agent. This model represents the final iteration of a voice-distillation program, aiming to synthesize new skills and address prior failure modes. It is primarily an archived research artifact, not intended for production use, and focuses on stylistic signature rather than improved factuality.
Loading preview...
Overview
latte-agent/qwen3-4b-latte-v6 is a 4 billion parameter LoRA fine-tune of the Qwen3-4B-Instruct-2507 base model, developed by latte-agent. This model, designated as v6, is the final iteration of the "Latte distillation program," a research initiative focused on distilling a specific "voice" into the model. Despite achieving the lowest validation loss in the program, v6 did not demonstrate significant improvement in blind evaluation against the base model or previous versions, leading to the program's closure.
Key Characteristics
- Architecture: LoRA fine-tune of Qwen3-4B-Instruct-2507.
- Parameter Count: 4 billion parameters.
- Context Length: 32768 tokens.
- Training Data: Dataset of 567 pairs, including real voice anchors, skill-anchored Q&A, and corrective pairs.
- Performance: Evaluation showed v6 did not consistently outperform the base model or v5 in blind voice-fit assessments, despite improved validation loss.
- Research Status: This model is an archived research artifact; the distillation program is closed, and it is not recommended for production use.
Usage Notes
- Not for Production: The model's developers explicitly state it is not advancing user-facing quality and recommend using the base
qwen3:4b-instruct-2507-q4_K_Mfor production. - Focus: The distillation process amplified stylistic signature but did not improve underlying factuality.
- Available Formats: Provided in MLX LoRA, HF/bfloat16 fused, and GGUF (F16, Q4_K_M) formats for those interested in research or experimentation.