zake7749/gemma-4-31B-it-chinese-reasoning-preview-e1

  • Tags: Vision, Open Weights, Cold
  • Concurrency Cost: 2
  • Model Size: 31B
  • Quant: FP8
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Apr 23, 2026
  • License: cc-by-nc-4.0
  • Architecture: Transformer

zake7749/gemma-4-31B-it-chinese-reasoning-preview-e1 is a 31-billion-parameter Gemma-4 instruction-tuned model, based on google/gemma-4-31B-it and adapted for Chinese reasoning. This early research preview aims to make the model reason in Chinese when given Chinese input, while preserving general reasoning quality and reducing unnecessary reasoning-token usage. It is intended for exploring how targeted trajectories can improve Chinese reasoning, and for encouraging collaboration on Chinese-native reasoning and agent post-training.


Overview

zake7749/gemma-4-31B-it-chinese-reasoning-preview-e1 is an early research preview of a 31 billion parameter Gemma-4 instruction-tuned model, based on google/gemma-4-31B-it. Its primary goal is to adapt the base model to reason more directly and efficiently in Chinese when presented with Chinese user input. This adaptation focuses on preserving general reasoning quality while minimizing the use of extraneous reasoning tokens.

Key Capabilities

  • Chinese Reasoning Alignment: Specifically post-trained to enhance reasoning behavior for Chinese language inputs.
  • Reduced Reasoning-Token Usage: Aims to cut unnecessary reasoning tokens while preserving general reasoning quality.
  • Research Focus: Serves as an experimental checkpoint to explore the improvement of Chinese reasoning through targeted trajectories and to foster broader collaboration in Chinese-native reasoning and agent post-training.
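
One way to check the reduced reasoning-token claim on your own prompts is to measure how much of each completion is spent on reasoning. The sketch below is a minimal illustration, not part of the model's tooling. It assumes the reasoning span is wrapped in `<think>` and `</think>` markers, which are hypothetical here because this card does not state the delimiters the chat template actually emits. It also uses non-whitespace character counts as a crude stand-in for token counts. The function names are likewise illustrative.

```python
import re

# Hypothetical delimiters: this card does not state the real reasoning
# markers, so replace these with whatever your serving stack emits.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def split_reasoning(output: str) -> tuple[str, str]:
    """Split a raw completion into (reasoning, answer)."""
    pattern = re.escape(THINK_OPEN) + r"(.*?)" + re.escape(THINK_CLOSE)
    match = re.search(pattern, output, flags=re.DOTALL)
    if match is None:
        # No reasoning span found: treat the whole output as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = (output[:match.start()] + output[match.end():]).strip()
    return reasoning, answer


def reasoning_share(output: str) -> float:
    """Fraction of non-whitespace characters spent on reasoning.

    Character counts are only a rough proxy for tokens; use the model's
    tokenizer if exact token counts matter.
    """
    reasoning, answer = split_reasoning(output)
    reasoning_len = len("".join(reasoning.split()))
    answer_len = len("".join(answer.split()))
    total = reasoning_len + answer_len
    return reasoning_len / total if total else 0.0


if __name__ == "__main__":
    sample = "<think>先算 2+3</think>答案是 5"
    print(split_reasoning(sample))
    print(round(reasoning_share(sample), 3))
```

Running the same prompts through this model and through google/gemma-4-31B-it and comparing the average share gives a rough, informal signal; it does not say anything about answer quality.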

Limitations (Early Research Preview)

  • Fragile Cross-lingual Transfer: The base model is English-optimized, and Chinese reasoning performance may still be unstable due to limited Chinese reasoning trajectories.
  • Limited Training Scale: Trained with a restricted number of reasoning trajectories for only one epoch, which is insufficient for full stabilization.
  • Narrow Training Distribution: Post-training data is heavily skewed towards general chat, potentially limiting generalization to domains like math, coding, tool use, or agentic tasks. Chinese writing quality may also show a noticeable gap compared to well-trained models like Monomer.