Name: latte-agent/qwen3-4b-latte-v6 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: latte-agent

Overview

latte-agent/qwen3-4b-latte-v6 is a 4 billion parameter LoRA fine-tune of the Qwen3-4B-Instruct-2507 base model, developed by latte-agent. This model, designated as v6, is the final iteration of the "Latte distillation program," a research initiative focused on distilling a specific "voice" into the model. Despite achieving the lowest validation loss in the program, v6 did not demonstrate significant improvement in blind evaluation against the base model or previous versions, leading to the program's closure.

Key Characteristics

Architecture: LoRA fine-tune of Qwen3-4B-Instruct-2507.
Parameter Count: 4 billion parameters.
Context Length: 32768 tokens.
Training Data: Dataset of 567 pairs, including real voice anchors, skill-anchored Q&A, and corrective pairs.
Performance: Evaluation showed v6 did not consistently outperform the base model or v5 in blind voice-fit assessments, despite improved validation loss.
Research Status: This model is an archived research artifact; the distillation program is closed, and it is not recommended for production use.

Usage Notes

Not for Production: The model's developers explicitly state it is not advancing user-facing quality and recommend using the base qwen3:4b-instruct-2507-q4_K_M for production.
Focus: The distillation process amplified stylistic signature but did not improve underlying factuality.
Available Formats: Provided in MLX LoRA, HF/bfloat16 fused, and GGUF (F16, Q4_K_M) formats for those interested in research or experimentation.

Overview

Overview

Key Characteristics

Usage Notes

Full Model Card (README)