rinna/qwq-bakeneko-32b

Warm
Public
32.8B
FP8
32768
2
Mar 12, 2025
License: apache-2.0
Hugging Face

rinna/qwq-bakeneko-32b is a 32.8 billion parameter instruction-tuned reasoning model developed by rinna and based on the Qwen2.5 architecture. It was built using Chat Vector merging and Odds Ratio Preference Optimization (ORPO) to deliver strong performance on Japanese language tasks. The model is designed for reasoning applications, follows the Qwen/QwQ-32B chat format, and supports a 131,072-token context length, making it suitable for complex Japanese language processing.

Overview

Overview of rinna/qwq-bakeneko-32b

rinna/qwq-bakeneko-32b is a 32.8 billion parameter instruction-tuned reasoning model derived from the rinna/qwen2.5-bakeneko-32b base model. Developed by rinna, it is optimized for Japanese language tasks and follows the Qwen/QwQ-32B chat format. Architecturally, it is a 64-layer transformer with a hidden size of 5120 and a context length of 131,072 tokens.

Key Capabilities and Training

  • Instruction-Tuned Reasoning: The model is fine-tuned for reasoning tasks, leveraging a multi-stage training process.
  • Chat Vector Merging: Instruction-following and reasoning capabilities were added by merging a Chat Vector, computed as the parameter-wise difference between Qwen/QwQ-32B and Qwen/Qwen2.5-32B, into the base model (see the first sketch after this list).
  • ORPO Refinement: The merged model was further refined with Odds Ratio Preference Optimization (ORPO), trained on 1.3k curated data samples generated by DeepSeek-R1 (see the second sketch after this list).
  • Japanese Language Focus: Benchmark results show strong performance on the Japanese LM Evaluation Harness and Japanese MT-Bench (first-turn and multi-turn) compared to its base models and other Qwen variants.
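
The Chat Vector step can be pictured as simple parameter arithmetic. The sketch below is a minimal, hypothetical reconstruction using Hugging Face transformers; it assumes the published checkpoints named above and omits details of rinna's actual recipe (such as which layers, if any, are excluded from the addition, and the output path, which is illustrative).

```python
import torch
from transformers import AutoModelForCausalLM

# Chat Vector idea:
#   chat_vector = QwQ-32B - Qwen2.5-32B          (reasoning/instruction delta)
#   merged      = qwen2.5-bakeneko-32b + chat_vector
base = AutoModelForCausalLM.from_pretrained("rinna/qwen2.5-bakeneko-32b", torch_dtype=torch.bfloat16)
donor = AutoModelForCausalLM.from_pretrained("Qwen/QwQ-32B", torch_dtype=torch.bfloat16)
donor_base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B", torch_dtype=torch.bfloat16)

donor_state = donor.state_dict()
donor_base_state = donor_base.state_dict()

merged_state = {}
for name, weight in base.state_dict().items():
    if name in donor_state and name in donor_base_state:
        # Add the parameter-wise difference (the "chat vector") to the Japanese base model.
        merged_state[name] = weight + (donor_state[name] - donor_base_state[name])
    else:
        merged_state[name] = weight  # keep parameters that exist only in the base checkpoint

base.load_state_dict(merged_state)
base.save_pretrained("qwq-bakeneko-32b-chat-vector")  # hypothetical output path
```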

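The ORPO stage is a preference-optimization fine-tune of the merged checkpoint. The following is a minimal sketch assuming a recent TRL release with ORPOTrainer and an illustrative preference dataset with prompt/chosen/rejected fields; the actual training data, hyperparameters, and setup used by rinna are not described here.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# Illustrative preference data: each record pairs a prompt with a preferred
# ("chosen") and a dispreferred ("rejected") completion.
train_dataset = Dataset.from_list([
    {"prompt": "...", "chosen": "...", "rejected": "..."},
])

merged_path = "qwq-bakeneko-32b-chat-vector"  # output of the previous sketch
model = AutoModelForCausalLM.from_pretrained(merged_path)
tokenizer = AutoTokenizer.from_pretrained(merged_path)

args = ORPOConfig(
    output_dir="qwq-bakeneko-32b-orpo",  # hypothetical output path
    beta=0.1,                            # odds-ratio loss weight; rinna's value is not published here
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```
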
When to Use This Model

  • Japanese Language Applications: Ideal for applications requiring high-performance understanding and generation of Japanese text (a minimal loading and generation sketch follows this list).
  • Reasoning Tasks: Suited for tasks that benefit from instruction-tuned reasoning capabilities.
  • Long Context Processing: Its 131,072-token context length makes it effective for processing and generating extensive Japanese text.
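
A minimal inference sketch using Hugging Face transformers is shown below. It assumes the chat template bundled with the tokenizer implements the Qwen/QwQ-32B chat format; the prompt and sampling settings are illustrative only, not recommended values from rinna.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/qwq-bakeneko-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The model follows the Qwen/QwQ-32B chat format, so the tokenizer's chat
# template takes care of the message formatting.
messages = [
    # Example prompt (Japanese): "Explain Japan's tallest mountain, including the reasons."
    {"role": "user", "content": "日本で一番高い山について、理由を含めて説明してください。"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,   # illustrative sampling settings
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```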