shenzhi-wang/Gemma-2-9B-Chinese-Chat
Gemma-2-9B-Chinese-Chat is a 9.24 billion parameter instruction-tuned language model developed by Shenzhi Wang and Yaowei Zheng, built upon Google's Gemma-2-9b-it. It is the first model of its kind specifically fine-tuned for Chinese and English users, excelling in roleplaying, tool-using, and mathematical tasks. The model significantly reduces issues of mixed-language responses and enhances performance compared to its base model, utilizing a 8192 token context length.
Loading preview...
Overview
Gemma-2-9B-Chinese-Chat is a 9.24 billion parameter instruction-tuned language model, developed by Shenzhi Wang and Yaowei Zheng, based on the google/gemma-2-9b-it architecture. It is notable as the first instruction-tuned model built upon Gemma-2-9b-it specifically for Chinese and English users, offering a context length of 8192 tokens.
Key Capabilities & Differentiators
- Bilingual Optimization: This model addresses and significantly reduces the common issue of "Chinese questions with English answers" or mixed-language responses often seen in other models, providing more coherent Chinese and English outputs.
- Enhanced Performance: Compared to the original
google/gemma-2-9b-it, this model demonstrates enhanced performance in several key areas:- Roleplaying: Capable of engaging in various roleplay scenarios.
- Tool-using: Supports function calling for integration with external tools.
- Mathematics: Improved ability to solve mathematical problems.
- Training Methodology: Fine-tuned using the ORPO algorithm with a preference dataset containing over 100K preference pairs, indicating a focus on aligning with user preferences.
- Accessibility: Provides various GGUF files for local deployment and an official Ollama model for quick use, alongside an online demo for easy testing.
Ideal Use Cases
- Bilingual Chatbots: Excellent for applications requiring robust conversational AI in both Chinese and English.
- Roleplaying Applications: Suitable for creative writing, interactive storytelling, and character-based interactions.
- Tool-Augmented Systems: Can be integrated into systems that leverage external tools or APIs for enhanced functionality.
- Educational & Mathematical Tools: Useful for tasks involving problem-solving and mathematical reasoning in a bilingual context.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.