abeja/ABEJA-QwQ32b-Reasoning-Japanese-v1.0

Parameters: 32.8B
Precision: FP8
Context length: 32768
Released: Mar 25, 2025
License: apache-2.0

ABEJA-QwQ32b-Reasoning-Japanese-v1.0 is a 32.8 billion parameter Japanese reasoning model developed by ABEJA. It is based on abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1, which itself is a Qwen2.5-32B-Instruct model continuously pre-trained with a focus on Japanese. This model integrates the ChatVector from Qwen/QwQ-32B and undergoes additional training to enhance its Japanese reasoning capabilities, specifically designed to output a final answer after an explicit thought process enclosed in <think></think> tags.

Overview

Model Overview

ABEJA-QwQ32b-Reasoning-Japanese-v1.0 is a 32.8 billion parameter language model developed by ABEJA, specifically engineered for enhanced Japanese reasoning. It builds upon abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1, a Japanese-centric, continuously pre-trained version of Qwen/Qwen2.5-32B-Instruct. A key differentiator is the integration of the ChatVector from Qwen/QwQ-32B, followed by additional fine-tuning to optimize its reasoning performance in Japanese.

Key Capabilities & Features

  • Explicit Reasoning Process: The model is designed to generate a thought process enclosed within <think></think> tags before producing its final output, promoting transparency and structured reasoning.
  • Japanese Language Optimization: Developed with a strong focus on Japanese, ensuring robust performance for tasks requiring complex reasoning in the language.
  • Qwen/QwQ-32B Integration: Merges the ChatVector derived from Qwen/QwQ-32B, transferring that model's reasoning behavior into the Japanese base model.
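Because the final answer follows an explicit `<think></think>` block, downstream code typically needs to separate the reasoning trace from the answer. A minimal sketch of such post-processing (the helper name and sample string are illustrative, not part of the model card):

```python
import re

# Matches the reasoning block the model emits before its final answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, final_answer) from raw model output.

    If no <think></think> block is present, the reasoning part is empty
    and the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Illustrative output shape (not an actual model transcript):
raw = "<think>\nまず問題を分解する。\n</think>\n答えは42です。"
reasoning, answer = split_reasoning(raw)
```

This keeps the thought process available for logging or inspection while only the text after `</think>` is shown to the end user.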

Usage Guidelines & Recommendations

To achieve optimal performance, users are advised to follow specific usage guidelines, many of which are automatically handled by apply_chat_template:

  • Forced Thought Process: Prefill the assistant output with <think>\n so that the reasoning process is always engaged.
  • Recommended Parameters: Use temperature=0.6, top_p=0.95, min_p=0, and top_k between 20 and 40.
  • Multi-turn Conversations: Exclude <think></think> sections from conversation history in multi-turn interactions.
  • No System Prompt: Do not include a system message; begin the conversation directly with a role: user message.
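The guidelines above can be sketched in plain Python: strip reasoning blocks from prior assistant turns before resending the history, keep a system-prompt-free message list, and pass the recommended sampling parameters at generation time. The helper and dictionary names here are illustrative; in practice apply_chat_template handles the <think>\n prefill automatically.

```python
import re

# Sampling parameters recommended by the model card.
RECOMMENDED_SAMPLING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "min_p": 0.0,
    "top_k": 20,  # the card recommends a value between 20 and 40
}

# A <think></think> block plus any trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def prepare_history(messages: list[dict]) -> list[dict]:
    """Drop <think></think> sections from assistant turns, per the
    multi-turn guideline; user turns pass through untouched. Note that
    no system message is used: the history starts with a user turn."""
    cleaned = []
    for m in messages:
        if m["role"] == "assistant":
            cleaned.append({
                "role": "assistant",
                "content": THINK_BLOCK.sub("", m["content"]).strip(),
            })
        else:
            cleaned.append(m)
    return cleaned

# Illustrative multi-turn history (not an actual model transcript):
history = [
    {"role": "user", "content": "1+1は?"},
    {"role": "assistant", "content": "<think>\n簡単な足し算。\n</think>\n2です。"},
    {"role": "user", "content": "では2+2は?"},
]
clean_history = prepare_history(history)
```

The cleaned history and `RECOMMENDED_SAMPLING` would then be passed to whatever generation API is in use (e.g. as keyword arguments to a generate call).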