Name: nerkyor/Qwen3.6-35B-A3B-DSV4Pro-Thinking-Distill API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: nerkyor

Model Overview

nerkyor/Qwen3.6-35B-A3B-DSV4Pro-Thinking-Distill is a 35-billion-parameter Mixture-of-Experts (MoE) model with 3 billion active parameters, built on the Qwen3.6 architecture. It is designed as a high-end local orchestrator for the Lynn Agent and serves as the sparse counterpart to a 27B dense sister model. Its core innovation is the distillation process: through LoRA, it learns the reasoning style and agentic behavior of DeepSeek-V4-Pro, in particular its 'thinking-on' approach to task decomposition, delegation, and verification.
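To make the sparse-versus-dense comparison concrete, the short sketch below works out what share of the weights is active per token and how that compares with the 27B dense sister model. It uses only the rounded sizes quoted above; treating active parameter count as a proxy for per-token compute is a simplifying assumption, not a measured result.

```python
# Nominal, rounded sizes taken from the description above.
TOTAL_PARAMS = 35e9         # all experts combined
ACTIVE_PARAMS = 3e9         # parameters used for each token
DENSE_SISTER_PARAMS = 27e9  # dense model: every parameter is active


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that take part in a single forward pass."""
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
dense_ratio = DENSE_SISTER_PARAMS / ACTIVE_PARAMS

print(f"Active per token: {frac:.1%} of all parameters")
print(f"Dense sister activates {dense_ratio:.0f}x more parameters per token")
```

The full 35B weights still have to be held in memory; only the per-token work scales with the 3B active subset.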

Key Capabilities & Differentiators

Task Orchestration: Purpose-built for task management within the Lynn Agent, with faster decision-making and convergence.
Enhanced Reasoning: Improves GPQA-Diamond-198 accuracy by 7.6 percentage points, a clear gain on hard reasoning tasks.
Faster End-to-End Orchestration: Completes end-to-end orchestration 2.3x faster because it needs fewer tokens to reach a decision.
Reduced Ambiguity: Cuts non-terminating empty answers on GPQA from 12 to 1, showing improved decisiveness.
Native MTP (nextn) Support: Includes a native speculative decoding head for single-stream acceleration, with speedups of up to 1.63x under Q8_0 quantization.
Distilled Thinking Style: Learns how to reason and converge rather than absorbing new knowledge, which suits it to complex problem-solving workflows.
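The figures in the list above can be put on a common footing with a little arithmetic. The sketch below converts the percentage-point gain into an approximate question count, the empty-answer counts into rates, and the speedups into time saved. It assumes the benchmark has 198 questions (read off the "GPQA-Diamond-198" name) and that the empty-answer counts refer to that same set; neither is stated explicitly in the listing.

```python
GPQA_QUESTIONS = 198  # assumption: taken from the "GPQA-Diamond-198" name


def pp_to_questions(pp: float, n_questions: int) -> float:
    """Convert a percentage-point accuracy change into a question count."""
    return pp / 100 * n_questions


def time_saved(speedup: float) -> float:
    """Fraction of wall-clock time removed by a given speedup factor."""
    return 1 - 1 / speedup


extra_correct = pp_to_questions(7.6, GPQA_QUESTIONS)
empty_before = 12 / GPQA_QUESTIONS
empty_after = 1 / GPQA_QUESTIONS

print(f"+7.6 pp is roughly {extra_correct:.0f} more correct answers")
print(f"Empty answers: {empty_before:.1%} -> {empty_after:.1%}")
print(f"2.3x orchestration speedup saves {time_saved(2.3):.1%} of the time")
print(f"1.63x decoding speedup saves up to {time_saved(1.63):.1%} of the time")
```

Note that a speedup factor and a time saving are not the same number: a 2.3x speedup removes a little over half of the original time, not 230% of it.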

Limitations

Knowledge Ceiling: Distillation targets thinking style rather than knowledge, so MMLU scores dip slightly (about 1.2 percentage points) relative to the base model.
Specialized Role: Primarily an orchestrator rather than a broad knowledge model; its strength is agentic workflows, not general knowledge breadth.

Recommended Use Cases

Lynn Agent Deployments: Ideal for local orchestration on machines with 32GB+ VRAM/unified memory.
Complex Task Management: Suited for applications requiring robust task decomposition, delegation, and verification.
Agentic Workflows: Well suited to scenarios where the model must reason through steps, call tools, and converge on a solution efficiently.
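For the 32GB+ memory recommendation above, a rough weights-only estimate can help with sizing. The sketch below assumes the nominal 35 billion parameter count and the block layouts of the ggml Q8_0 and Q4_0 formats (8.5 and 4.5 bits per weight, respectively). Q4_0 and F16 are included for comparison only and are not mentioned in the listing. Real quantized files mix tensor types, and the KV cache and runtime overhead come on top, so treat the numbers as a lower bound.

```python
NOMINAL_PARAMS = 35e9  # rounded figure from the model name

# Bits per weight, including per-block scale factors.
# Q4_0 and F16 are illustrative additions, not from the listing.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q4_0": 4.5}


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def fits(params: float, bits_per_weight: float, budget_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_gb(params, bits_per_weight) <= budget_gb


for name, bpw in BITS_PER_WEIGHT.items():
    size = weight_gb(NOMINAL_PARAMS, bpw)
    verdict = "fits" if fits(NOMINAL_PARAMS, bpw, 32) else "does not fit"
    print(f"{name}: ~{size:.1f} GB of weights, {verdict} in 32 GB")
```

Under these assumptions the Q8_0 weights alone come to about 37 GB, so a machine at exactly 32 GB would likely need a lower-bit quantization; the listing does not say which quantization its recommendation has in mind.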
