rikunarita/Qwen3-4B-Thinking-2507-Genius-v2
Text generation · Model size: 4B · Quant: BF16 · Ctx length: 32k · Published: Feb 9, 2026 · Architecture: Transformer

rikunarita/Qwen3-4B-Thinking-2507-Genius-v2 is a 4-billion-parameter causal language model, merged with the Model Stock method from Qwen3-4B-Thinking-2507-Genius, Qwen3-4B-Thinking-2507-Genius-Coder, and Qwen3-4B-Thinking-2507-Genius-Mathematician. It features a native context length of 262,144 tokens and is optimized for complex reasoning tasks, including logical reasoning, mathematics, science, and coding. The model is designed for scenarios requiring deep analytical thought and enhanced long-context understanding.


Overview

This model, rikunarita/Qwen3-4B-Thinking-2507-Genius-v2, is a 4-billion-parameter causal language model built on the Qwen3 architecture. It was created with the Model Stock merge method, combining specialized versions of Qwen3-4B-Thinking-2507 focused on general reasoning, coding, and mathematics. The merge aims to enhance overall "thinking capability" and performance across complex tasks.
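
A minimal quickstart sketch, following the standard Qwen3 `transformers` usage pattern; the prompt and the `max_new_tokens` budget below are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rikunarita/Qwen3-4B-Thinking-2507-Genius-v2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # BF16 on supported hardware
    device_map="auto",
)

# Illustrative prompt; any chat-style input works.
messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Thinking models emit a long reasoning trace before the answer,
# so allow a generous output budget.
generated_ids = model.generate(**model_inputs, max_new_tokens=32768)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

print(tokenizer.decode(output_ids, skip_special_tokens=True))
```

The generous output budget matters in practice: the reasoning trace counts against `max_new_tokens`, and the upstream Qwen3 thinking cards recommend budgets in the tens of thousands of tokens for hard problems.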

Key Capabilities

  • Enhanced Reasoning: Demonstrates significantly improved performance on tasks requiring logical reasoning, mathematics (e.g., AIME25, HMMT25), science, and coding (e.g., LiveCodeBench, CFEval).
  • Extended Context Understanding: Features a native context length of 262,144 tokens, with recommendations for using at least 131,072 tokens for optimal reasoning performance.
  • Agentic Use: Excels at tool calling, with support for frameworks like Qwen-Agent to simplify integration (see the sketch after this list).
  • General Improvements: Shows better instruction following, tool usage, and alignment with human preferences.
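
A sketch of agentic use with Qwen-Agent, following the pattern shown in the upstream Qwen3 cards. It assumes the model is served behind an OpenAI-compatible endpoint (e.g., via vLLM or SGLang); the server URL and prompt below are placeholders:

```python
from qwen_agent.agents import Assistant

# Placeholder endpoint for an OpenAI-compatible server hosting the model.
llm_cfg = {
    "model": "rikunarita/Qwen3-4B-Thinking-2507-Genius-v2",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# Built-in code interpreter tool; Qwen-Agent handles the tool-call
# protocol and chat-template details internally.
tools = ["code_interpreter"]

bot = Assistant(llm=llm_cfg, function_list=tools)

# Illustrative request that should trigger a tool call.
messages = [{"role": "user", "content": "Compute the 30th Fibonacci number."}]
for responses in bot.run(messages=messages):
    pass  # bot.run streams partial responses; keep the final one
print(responses)
```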

Good For

  • Highly Complex Reasoning Tasks: Recommended for scenarios demanding deep analytical thought, problem-solving, and multi-step reasoning.
  • Mathematical and Scientific Problem Solving: Particularly strong in academic benchmarks related to math and science.
  • Code Generation and Analysis: Improved performance in coding challenges and tasks.
  • Long-Context Applications: Ideal for processing and understanding extensive documents or conversations due to its large context window.

Usage Notes

This model operates in a dedicated "thinking mode" by default; specifying enable_thinking=True is no longer required. To enforce thinking behavior, the default chat template prepends an opening <think> tag to the generation, so the model's output typically contains only the closing </think> tag without an explicit opening <think>.
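
Continuing the quickstart sketch above, the reasoning trace can be separated from the final answer by locating the closing </think> token. Per the upstream Qwen3-Thinking-2507 quickstart, token id 151668 is </think> in the Qwen3 tokenizer:

```python
# Split the reasoning trace from the final answer. The output contains
# only a closing </think> (151668), since the template injects the opener.
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0  # no </think> found; treat everything as final content

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
print("thinking:", thinking_content)
print("answer:", content)
```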