Overview
rinna/qwen2.5-bakeneko-32b-instruct-v2 is an instruction-tuned variant of the rinna/qwen2.5-bakeneko-32b model, developed by rinna. It was built in two stages: model merging via Chat Vector addition, followed by further refinement through distillation and Odds Ratio Preference Optimization (ORPO).
Key Capabilities
- Enhanced Instruction Following: Achieved through Chat Vector addition, improving upon its predecessor, rinna/qwen2.5-bakeneko-32b-instruct.
- Reasoning Performance: Demonstrates strong reasoning capabilities on Japanese MT-Bench, comparable to dedicated reasoning models, without emitting explicit reasoning steps at inference time.
- Optimized for Japanese: Specifically fine-tuned to excel in Japanese language tasks, as indicated by its performance on Japanese MT-Bench benchmarks.
- Qwen2.5 Architecture: Built upon the 64-layer, 5120-hidden-size transformer-based Qwen2.5 architecture.
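The 64-layer, 5120-hidden-size geometry above can be sanity-checked with a back-of-the-envelope parameter count. Note that only the layer count and hidden size come from this card; the intermediate size, head counts, and vocabulary size below are assumed from Qwen2.5-32B's published configuration and should be verified against the model's config.json:

```python
# Rough parameter count for the Qwen2.5-32B geometry cited in the card.
hidden, layers = 5120, 64   # from the card
intermediate   = 27648      # MLP width (assumed from Qwen2.5-32B config)
n_heads, n_kv  = 40, 8      # grouped-query attention heads (assumed)
vocab          = 152064     # tokenizer vocabulary size (assumed)

head_dim = hidden // n_heads                # 128
attn  = hidden * hidden * 2                 # q_proj + o_proj weights
attn += hidden * (n_kv * head_dim) * 2      # k_proj + v_proj (GQA: fewer KV heads)
mlp   = hidden * intermediate * 3           # gate, up, and down projections
embeddings = vocab * hidden * 2             # untied input and output embeddings

total = layers * (attn + mlp) + embeddings
print(f"~{total / 1e9:.1f}B parameters")    # ~32.8B, consistent with a "32b" model
```

Biases and normalization weights are omitted; they contribute well under 0.1% of the total.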
Training Details
The model's instruction-following ability was enhanced by merging rinna/qwen2.5-bakeneko-32b-instruct with rinna/qwq-bakeneko-32b using a Chat Vector addition process. The merged model was then refined with ORPO on 1.3k curated samples generated by DeepSeek-R1.
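The two techniques above can be sketched in miniature. This is a toy illustration, not the actual rinna pipeline: the tiny dicts stand in for real state_dicts (which hold tensors), the exact merge recipe and ORPO hyperparameters used by rinna are not published in this card, and the loss function shows only the odds-ratio term of ORPO (the full objective also includes a supervised fine-tuning term):

```python
import math

# Chat Vector addition: treat the weight delta between a fine-tune and its
# shared base as a transferable "chat vector" and add it onto another model.
base     = {"w": 1.00, "b": 0.50}   # stand-in for a shared base model
model_a  = {"w": 1.20, "b": 0.40}   # stand-in for one fine-tune
model_b  = {"w": 0.90, "b": 0.70}   # stand-in for another fine-tune

chat_vector = {k: model_b[k] - base[k] for k in base}
merged      = {k: model_a[k] + chat_vector[k] for k in base}

# ORPO's odds-ratio term: penalize the model when the odds it assigns to a
# preferred ("win") response do not exceed the odds of a rejected one.
def orpo_or_loss(p_win: float, p_lose: float) -> float:
    """-log sigmoid(log(odds(p_win) / odds(p_lose))), odds(p) = p / (1 - p)."""
    odds = lambda p: p / (1.0 - p)
    log_or = math.log(odds(p_win)) - math.log(odds(p_lose))
    return -math.log(1.0 / (1.0 + math.exp(-log_or)))

# The loss shrinks as the model prefers the chosen response more strongly.
print(merged)
print(orpo_or_loss(0.9, 0.1), orpo_or_loss(0.5, 0.5))
```

With real checkpoints the same arithmetic would run element-wise over each tensor in the state_dicts, and the preference probabilities would come from the model's sequence log-likelihoods on chosen/rejected pairs.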
Benchmarking Highlights
On Japanese MT-Bench (multi-turn), rinna/qwen2.5-bakeneko-32b-instruct-v2 scores 8.53, outperforming Qwen/Qwen2.5-32B-Instruct (7.54) and rinna/qwen2.5-bakeneko-32b-instruct (7.66), and closely matching rinna/qwq-bakeneko-32b (8.52).
Good For
- Applications requiring robust instruction following in Japanese.
- Tasks demanding strong reasoning capabilities in Japanese without the overhead of additional reasoning steps.
- Developers seeking a high-performance Japanese language model based on the Qwen2.5 family.