shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:May 25, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Mistral-7B-v0.3-Chinese-Chat by Shenzhi Wang is a 7.25 billion parameter instruction-tuned language model based on Mistral-7B-Instruct-v0.3, specifically fine-tuned for both Chinese and English users. It significantly improves Chinese language capabilities and reduces issues of mixed-language responses compared to its base model. This model excels in mathematics, roleplay, and tool use, making it suitable for diverse conversational AI applications.

Loading preview...

Overview

Mistral-7B-v0.3-Chinese-Chat is a 7.25 billion parameter instruction-tuned language model developed by Shenzhi Wang and team. It is built upon the mistralai/Mistral-7B-Instruct-v0.3 base model and has been full-parameter fine-tuned on a mixed Chinese-English dataset of approximately 100K preference pairs using the ORPO algorithm. This fine-tuning process has significantly enhanced its Chinese language abilities and reduced instances of mixed Chinese and English responses, a common issue in general-purpose models.

Key Capabilities

  • Enhanced Bilingual Performance: Specifically optimized for both Chinese and English users, addressing common cross-lingual response issues.
  • Diverse Functionality: Demonstrates strong performance in areas such as mathematics, roleplay, and tool use.
  • Instruction Following: Instruction-tuned to follow user commands effectively across various tasks.
  • GGUF Versions Available: Official q4, q8, and f16 GGUF quantized versions are provided for efficient deployment.

Good For

  • Bilingual Chatbots: Ideal for applications requiring robust conversational abilities in both Chinese and English.
  • Role-playing Scenarios: Capable of engaging in detailed role-play interactions.
  • Mathematical Problem Solving: Shows good performance in solving mathematical problems.
  • Tool Use/Function Calling: Designed to effectively utilize external tools based on instructions.
  • Developers seeking a Mistral-based model with strong Chinese support.