nerkyor/Lynn-V4-Flash-Distill-Qwen-35B-A3B-BF16-merged
The nerkyor/Lynn-V4-Flash-Distill-Qwen-35B-A3B-BF16-merged is a 35.1 billion parameter Mixture-of-Experts (MoE) model, distilled from DeepSeek-V4-Flash and DeepSeek-V4-Pro teachers, and based on the Qwen3-35B-A3B architecture. This BF16 merged model is optimized for fast daily assistant tasks, excelling in short-to-medium reasoning, tool calling, coding agent functionalities, and multi-style Chinese creative writing. It features a 32768 token context length and demonstrates strong performance in tool-calling and academic holdout evaluations.
Loading preview...
Lynn-V4-Flash-Distill-Qwen-35B-A3B-BF16-merged Overview
This model is a 35.1 billion parameter Mixture-of-Experts (MoE) language model, distilled from DeepSeek-V4-Flash and DeepSeek-V4-Pro teachers, and built upon the Qwen3-35B-A3B base architecture. It is provided in a BF16 merged format, weighing 65.4 GB. The model is specifically designed for the Lynn personal AI assistant ecosystem, focusing on efficiency and practical application.
Key Capabilities
- Fast Daily Assistant: Optimized for short-to-medium reasoning tasks.
- Tool Calling: Supports tool-calling with
qwen3_coderparser semantics for Bash, Read, Edit, Grep, and WebSearch. - Coding Agent: Proficient in algorithm tasks, debugging, and code refactoring.
- Multilingual Creative Writing: Excels in multi-style short-form Chinese creative writing across various platforms.
- Quick Research Summaries: Capable of generating structured outputs of 300-800 characters.
Performance Highlights
The model demonstrates strong evaluation results, passing all 4-gate evaluation thresholds with a NET_WIN score of +51.43pp. It achieves a 60.0% pass rate on V8 strict tool-calling and a 60.0% pass rate on the V9 academic holdout, outperforming its base model by +16.67pp. While BF16 is slower than quantized variants due to memory bandwidth, it offers higher fidelity.
Should I use this for my use case?
This model is ideal for developers needing a robust, fast daily assistant for general Chinese/English conversation, tool-augmented tasks, and coding support. If your application requires short-to-medium reasoning, structured research summaries, or multi-style Chinese creative writing, this model is a strong candidate. For long-form structured research output (>= 1500 characters), the V4-Pro variant is recommended. Note that it is not optimized for pure single-language outputs (Chinese-dominant training) and math/coding outputs are evaluated via reference-similarity, not formal benchmarks like GSM8K or HumanEval+.