stvlynn/Qwen-7B-Chat-Cantonese

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 32k | Published: May 4, 2024 | License: agpl-3.0 | Architecture: Transformer | Open Weights

Qwen-7B-Chat-Cantonese is a 7-billion-parameter, instruction-tuned causal language model developed by stvlynn and fine-tuned from Qwen-7B-Chat on a large amount of Cantonese-language data. With a context length of 32768 tokens, it is intended for understanding and generating Cantonese text in chat-based applications.


Qwen-7B-Chat-Cantonese: Cantonese-Optimized LLM

Qwen-7B-Chat-Cantonese is a large language model developed by stvlynn and built on Qwen-7B-Chat. What sets it apart from the base model is fine-tuning on a large volume of Cantonese-language data, aimed at both processing and generating Cantonese.

Key Capabilities

  • Cantonese Language Proficiency: Fine-tuned to understand and generate natural Cantonese text, for applications that need fluent Cantonese interaction.
  • Instruction Following: Inherits instruction-following capabilities from its base model, Qwen-7B-Chat, allowing it to respond to diverse prompts.
  • Context Handling: Supports a substantial context length of 32768 tokens, enabling it to maintain coherence over longer conversations or documents.
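To make the context limit concrete, here is a minimal sketch of keeping a multi-turn conversation inside the 32768-token window by dropping the oldest turns first. It is illustrative only and not taken from the model's own code: `trim_history`, `rough_token_count`, and the `reserve_for_reply` budget are hypothetical names, and the character-based counter is a crude placeholder. In practice, token counts should come from the model's actual tokenizer.

```python
from typing import Callable, List, Tuple

MAX_CONTEXT_TOKENS = 32768  # context length stated for this model

Turn = Tuple[str, str]  # (user message, assistant reply)


def rough_token_count(text: str) -> int:
    """Hypothetical placeholder: counts one token per character.

    Replace with a count from the model's real tokenizer.
    """
    return len(text)


def trim_history(
    history: List[Turn],
    new_prompt: str,
    reserve_for_reply: int = 1024,  # assumed budget left for the model's answer
    max_tokens: int = MAX_CONTEXT_TOKENS,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Turn]:
    """Keep the most recent turns that fit in the remaining token budget."""
    budget = max_tokens - reserve_for_reply - count_tokens(new_prompt)
    kept: List[Turn] = []
    # Walk backwards so the newest turns are kept and the oldest are dropped.
    for user_msg, reply in reversed(history):
        cost = count_tokens(user_msg) + count_tokens(reply)
        if cost > budget:
            break
        kept.append((user_msg, reply))
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping whole turns from the oldest end keeps each remaining exchange intact. Summarizing older turns instead is another option, but it is not covered here.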

Training Details

The model was fine-tuned with the AdamW optimizer at a learning rate of 7e-5 and a batch size of 1000. Training ran in fp16 precision for 1024 total steps with a cosine learning rate schedule, and gradients were accumulated over 8 steps.
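As a rough illustration of the cosine schedule mentioned above, the sketch below computes the learning rate at a given step from the stated peak of 7e-5 and 1024 total steps. It assumes no warmup phase and decay to zero, neither of which the training details specify. `cosine_lr` is a hypothetical helper, not the training script's own code.

```python
import math

PEAK_LR = 7e-5      # learning rate stated in the training details
TOTAL_STEPS = 1024  # total training steps stated in the training details


def cosine_lr(
    step: int,
    peak_lr: float = PEAK_LR,
    total_steps: int = TOTAL_STEPS,
    min_lr: float = 0.0,  # assumption: decay all the way to zero
) -> float:
    """Learning rate at `step` under plain cosine decay (assumes no warmup)."""
    if not 0 <= step <= total_steps:
        raise ValueError("step must be between 0 and total_steps")
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 256, 512, 768, 1024):
        print(f"step {step:4d}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions the rate starts at the peak value, passes half of it at the midpoint (step 512), and reaches zero at the final step.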

Use Cases

The model is suited to applications that need strong Cantonese performance, such as:

  • Cantonese-speaking chatbots and virtual assistants.
  • Content generation in Cantonese.
  • Language translation and localization tasks involving Cantonese.
  • Educational tools for Cantonese language learning.