Jackrong/Qwopus3.5-27B-v3

Vision | Concurrency Cost: 2 | Model Size: 27B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 1, 2026 | License: apache-2.0 | Architecture: Transformer | 0.2K | Open Weights | Cold

Jackrong/Qwopus3.5-27B-v3 is a 27-billion-parameter, reasoning-enhanced language model based on Qwen3.5-27B, fine-tuned to improve reasoning stability, correctness, and inference efficiency. It uses a structural alignment approach to Chain-of-Thought (CoT) optimization and adds specialized reinforcement learning for tool calling. The model is designed for complex, multi-step agentic workflows and emphasizes an "act-then-refine" paradigm, which suits it to programming tasks and offline analytical scenarios.


Qwopus3.5-27B-v3: Reasoning-Enhanced LLM

Qwopus3.5-27B-v3 is a 27-billion-parameter model built on Qwen3.5-27B that focuses on stronger reasoning and more efficient inference. It introduces an "act-then-refine" paradigm for multi-step agent systems: instead of deliberating at length before acting, the model acts first and then refines its approach based on feedback from the environment.
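
The act-then-refine idea can be illustrated with a small control loop. The following is only a minimal sketch and not the model's actual agent runtime: the names `act_then_refine`, `propose`, `execute`, and `Step` are hypothetical, and the toy "environment" is a number-guessing stand-in for a real tool or test runner.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    action: str    # what the agent tried
    feedback: str  # what the environment reported back
    ok: bool       # whether the environment accepted the action


def act_then_refine(
    propose: Callable[[str, List[Step]], str],
    execute: Callable[[str], Tuple[bool, str]],
    task: str,
    max_steps: int = 5,
) -> List[Step]:
    """Act first, then refine the next action using environmental feedback."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = propose(task, history)   # propose with little up-front deliberation
        ok, feedback = execute(action)    # the environment judges the action
        history.append(Step(action, feedback, ok))
        if ok:
            break                         # stop as soon as an action succeeds
    return history


# Toy stand-ins (hypothetical): the "model" adjusts a guess from feedback.
def toy_propose(task: str, history: List[Step]) -> str:
    if not history:
        return "0"
    last = int(history[-1].action)
    return str(last + 1) if history[-1].feedback == "too low" else str(last - 1)


def toy_execute(action: str) -> Tuple[bool, str]:
    target = 3
    guess = int(action)
    if guess == target:
        return True, "correct"
    return False, ("too low" if guess < target else "too high")


trace = act_then_refine(toy_propose, toy_execute, "find the number")
for step in trace:
    print(step.action, "->", step.feedback)
```

In a real deployment, `propose` would wrap a model call and `execute` would run a tool, a test suite, or a shell command; the loop structure is the only part this sketch is meant to convey.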

Key Capabilities & Differentiators

  • Structural Reasoning Optimization: Goes beyond simple CoT distillation by training on reasoning traces that are faithful, complete, and structurally clear, so the model learns the reasoning process itself rather than imitating final answers. This results in better generalization and robustness on unseen tasks.
  • Tool-Calling Reinforcement: Incorporates specialized RL training to improve stability and proficiency in tool invocation within tool-augmented agent frameworks like OpenClaw.
  • Performance: Achieves a strong balance between accuracy and efficiency, matching or outperforming Qwen3.5-27B on most tasks while using significantly fewer generated tokens. On HumanEval, Qwopus3.5-27B-v3 scored 95.73% (157/164), surpassing Qwen3.5-27B (94.51%).
  • Training Approach: Fine-tuned with Unsloth on a high-fidelity reasoning dataset using response-only training, in which prompt tokens are masked out of the loss so that only response tokens contribute to it.
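
Response-only training is usually implemented by masking the labels of prompt tokens so they are excluded from the loss. The sketch below shows that idea in plain Python. It is an illustration under assumptions, not the author's training script: the helper `mask_prompt_labels` and the token IDs are hypothetical, and `-100` is used because it is the conventional "ignore" label in PyTorch's cross-entropy loss and in Hugging Face training code.

```python
from typing import List, Sequence

IGNORE_INDEX = -100  # conventional label value that the loss function skips


def mask_prompt_labels(input_ids: Sequence[int], response_start: int) -> List[int]:
    """Return labels in which every token before `response_start` is ignored.

    Only tokens from `response_start` onward keep their IDs as labels, so
    only the response contributes to the training loss.
    """
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token IDs: the first four belong to the prompt, the rest to the response.
input_ids = [101, 7592, 2088, 102, 3437, 2003, 4413, 102]
labels = mask_prompt_labels(input_ids, response_start=4)
print(labels)  # [-100, -100, -100, -100, 3437, 2003, 4413, 102]
```

The input sequence itself is left unchanged, so the model still conditions on the full prompt; only the loss targets are masked.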

Good For

  • Offline Analytical Tasks: Excels in scenarios requiring transparent, step-by-step internal logic.
  • Coding & Mathematical Reasoning: Demonstrated strong performance on HumanEval, indicating proficiency in programming tasks.
  • Agentic Workflows: Optimized for complex, multi-step agent systems that benefit from iterative interaction and correction.
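
The HumanEval percentages reported for the two models can be checked with simple arithmetic. In the sketch below, the 157/164 figure comes from the model card; the count of 155 solved problems for Qwen3.5-27B is not stated there and is inferred from its 94.51% score, so treat it as an assumption.

```python
TOTAL_PROBLEMS = 164  # number of problems in HumanEval


def pass_rate(solved: int, total: int = TOTAL_PROBLEMS) -> float:
    """Percentage of problems solved, rounded to two decimals."""
    return round(100 * solved / total, 2)


qwopus_rate = pass_rate(157)                              # reported: 157/164
baseline_solved = round(94.51 / 100 * TOTAL_PROBLEMS)     # inferred from 94.51%
baseline_rate = pass_rate(baseline_solved)

print(qwopus_rate)            # 95.73
print(baseline_solved)        # 155
print(baseline_rate)          # 94.51
print(157 - baseline_solved)  # 2 more problems solved
```

On this reading, the gap between the two models on HumanEval corresponds to two problems out of 164.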

Limitations

  • As an autoregressive LLM, it can hallucinate, especially about real-world events referenced in its thinking sequences.
  • The model's reasoning chain (CoT) may occasionally exhibit instability, logic loops, or reasoning drift, because it was developed independently with limited resources.