Overview
Gemma-2-27B-Chinese-Chat is the first instruction-tuned language model built on Google's gemma-2-27b-it, tailored for both Chinese and English users. Developed by Shenzhi Wang, Yaowei Zheng, Guoyin Wang, Shiji Song, and Gao Huang, the model has 27.2 billion parameters and an 8K context length.
Key Capabilities & Differentiators
This model is fine-tuned with the ORPO (odds ratio preference optimization) algorithm on a preference dataset exceeding 100K pairs. A primary goal during development was to significantly reduce two common issues observed in the original google/gemma-2-27b-it: Chinese questions receiving English answers, and language mixing within a single response.
Key enhancements and features include:
- Bilingual Optimization: Specifically designed to improve performance and coherence for Chinese and English users, minimizing language mixing.
- Enhanced Abilities: Demonstrates improved roleplaying, tool use, and mathematical problem-solving.
- Training Details: Full-parameter fine-tuning for 3 epochs with a learning rate of 3e-6, a cosine learning-rate scheduler, and a global batch size of 128 (see the training sketch after this list).
- Availability: Provided in BF16 format and as GGUF quantizations (q4_k_m, q4_0, q8_0), with official Ollama support for easy deployment; a minimal inference sketch appears at the end of this section.
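
The training recipe above maps naturally onto an ORPO trainer such as the one in Hugging Face TRL. The sketch below is illustrative, not the authors' actual script: the dataset id, the per-device/accumulation split of the 128 global batch, and TRL itself are assumptions; only the algorithm, epochs, learning rate, and scheduler come from the details above.

```python
# Illustrative ORPO fine-tuning sketch with Hugging Face TRL.
# Assumptions (not from the model card): the placeholder dataset id,
# the batch split, and TRL as the training framework.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "google/gemma-2-27b-it"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# ORPOTrainer expects "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("your/preference-dataset", split="train")  # placeholder id

config = ORPOConfig(
    output_dir="gemma-2-27b-chinese-chat",
    num_train_epochs=3,                # 3 epochs, per the card
    learning_rate=3e-6,                # lr 3e-6, per the card
    lr_scheduler_type="cosine",        # cosine scheduler, per the card
    per_device_train_batch_size=1,     # assumed split of the
    gradient_accumulation_steps=128,   # 128 global batch size
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,  # "processing_class" in newer TRL releases
)
trainer.train()
```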
Use Cases
This model is particularly well-suited for applications requiring:
- Bilingual Chatbots: Where seamless switching and accurate responses in both Chinese and English are critical.
- Interactive Roleplay: Engaging in diverse character personas with improved consistency.
- Tool Integration: Utilizing external tools effectively for complex tasks.
- Mathematical Reasoning: Solving quantitative problems with enhanced accuracy.
Users should note that the model's identity has not been fine-tuned, so direct questions about its origin or developer may yield arbitrary or inaccurate answers.
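
For reference, here is a minimal bilingual-chat invocation of the BF16 weights with transformers. The Hugging Face repository id below is assumed from the model's name; check the model page for the exact id and for the GGUF/Ollama alternatives.

```python
# Minimal chat invocation of the BF16 weights with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shenzhi-wang/Gemma-2-27B-Chinese-Chat"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Gemma-2 chat models ship a chat template; a Chinese prompt exercises the
# model's main differentiator (answering Chinese questions in Chinese).
messages = [{"role": "user", "content": "请用中文简要介绍一下勾股定理。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```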