Jackrong/Qwopus3.6-35B-A3B-v1

Text Generation | Concurrency Cost: 3 | Model Size: 35.1B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: May 6, 2026 | License: apache-2.0 | Architecture: Transformer | 0.1K | Open Weights | Cold

Jackrong/Qwopus3.6-35B-A3B-v1 is a 35.1-billion-parameter, reasoning-enhanced Mixture-of-Experts (MoE) model, fine-tuned by Jackrong from the Qwen3.6-35B-A3B base model. It activates about 3B parameters per token for inference efficiency and natively supports a 262k context window (this listing specifies a 32k context length). The model is optimized for deep reasoning, agentic coding, and multimodal tasks, with an emphasis on overall quality and reliability.


Qwopus3.6-35B-A3B-v1: Reasoning-Enhanced MoE Model

Qwopus3.6-35B-A3B-v1 is a 35.1-billion-parameter Mixture-of-Experts (MoE) model, fine-tuned by Jackrong from the Qwen3.6-35B-A3B base model. It uses a hybrid sparse MoE architecture with 3 billion active parameters per token, which keeps inference efficient while supporting a native 262k context window. The model is designed for advanced reasoning, agentic coding, and multimodal applications.
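
As a rough illustration of the sparsity these figures imply, the sketch below computes the share of parameters that is active for each token. It uses only the rounded numbers quoted above (35.1B total, 3B active); the variable names are illustrative, and the result is approximate because both inputs are rounded.

```python
# Back-of-envelope sparsity estimate from the rounded figures quoted above.
TOTAL_PARAMS_B = 35.1   # total parameters, in billions (from the model card)
ACTIVE_PARAMS_B = 3.0   # active parameters per token, in billions (rounded)

# Fraction of the model's weights that take part in each token's forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

The remaining parameters still have to be held in memory, so this ratio speaks to per-token compute rather than to memory footprint.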

Key Capabilities & Features

  • Enhanced Reasoning: Fine-tuned through a three-stage distributed supervised fine-tuning (SFT) process to improve structured reasoning and produce a consistent answer style.
  • High Inference Efficiency: Achieves an average of 161.9 tok/s on an RTX 5090, a 2.6x speedup over dense predecessors, making it suitable for single-GPU consumer hardware.
  • Multimodal Support: Includes vision capabilities and tool calling. To enable vision, place the mmproj.gguf file alongside the main model file.
  • Robust Long-Context Performance: Addresses "thinking starvation" issues, maintaining performance in long-context JSON extraction and multi-step agentic planning.
  • Production-Grade UI/UX Generation: Excels at one-shot HTML/CSS generation, producing complete, functional pages with complex interactions.
  • LoRA Fine-tuning: Uses LoRA with approximately 9% of model parameters updated, allowing for deep adaptation of reasoning capabilities.
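
The vision requirement in the Multimodal Support item can be checked before loading the model. The following is a minimal sketch, assuming the projector file is named exactly mmproj.gguf and sits in the same directory as the main model file; the helper name find_mmproj and the example path are hypothetical and not part of any runtime's API.

```python
from pathlib import Path
from typing import Optional


def find_mmproj(model_path: str, mmproj_name: str = "mmproj.gguf") -> Optional[Path]:
    """Return the projector file located next to the model file, or None.

    Hypothetical helper: it only inspects the filesystem and does not load
    or validate either file.
    """
    candidate = Path(model_path).parent / mmproj_name
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Hypothetical local path; replace with the actual model file location.
    model_file = "models/Qwopus3.6-35B-A3B-v1.gguf"
    projector = find_mmproj(model_file)
    if projector is None:
        print("mmproj.gguf not found next to the model; vision would be unavailable.")
    else:
        print(f"Found projector file: {projector}")
```

How the projector file is passed to the inference runtime depends on the runtime in use and is not covered here.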

Use Cases

This model targets developers who need a high-throughput, agentic model that excels at UI/UX generation and complex logical deduction on a single-GPU setup. It is particularly suited to tasks that demand structured reasoning, consistent output, and efficient processing of long contexts.
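
To put the quoted throughput in practical terms, the sketch below estimates generation time from the 161.9 tok/s average and derives the dense baseline implied by the 2.6x speedup. It is a back-of-envelope calculation only: it ignores prompt processing time, assumes the quoted RTX 5090 average holds for the whole response, and the 4,000-token response length is an arbitrary example.

```python
TOKENS_PER_SECOND = 161.9  # average quoted for an RTX 5090
SPEEDUP = 2.6              # quoted speedup over dense predecessors


def generation_seconds(num_tokens: int, tok_per_s: float = TOKENS_PER_SECOND) -> float:
    """Estimate decode time in seconds for num_tokens at a constant rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return num_tokens / tok_per_s


# Throughput of the dense baseline implied by the quoted speedup.
implied_dense_tok_per_s = TOKENS_PER_SECOND / SPEEDUP

print(f"Implied dense baseline: {implied_dense_tok_per_s:.1f} tok/s")
print(f"4,000-token response: {generation_seconds(4000):.1f} s")
```

Actual latency will vary with quantization, context length, and batch size, none of which are specified alongside the quoted average.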