ganchengguang/Yoko_13B_Japanese_QLoRA
Yoko_13B_Japanese_QLoRA is a 13 billion parameter language model developed by ganchengguang, with contributions from Yokohama National University's Mori Lab. It is a QLoRA fine-tune of Llama-2-13b-chat-hf, optimized for improved performance in Japanese and Chinese. The model was trained on the llm-japanese-dataset plus additional chat and non-chat samples, making it suitable for conversational and general text generation tasks in these languages.
Yoko_13B_Japanese_QLoRA Overview
Yoko_13B_Japanese_QLoRA is a 13 billion parameter language model, fine-tuned from the Llama-2-13b-chat-hf base model using the QLoRA method. This model was developed by ganchengguang, with contributions from Yokohama National University Mori Lab, and is specifically engineered to enhance performance in both Japanese and Chinese language tasks.
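For context, QLoRA fine-tuning quantizes the frozen base model to 4-bit precision and trains small low-rank adapters on top of it. The sketch below shows a typical setup with transformers, peft, and bitsandbytes; the LoRA hyperparameters (rank, alpha, dropout, target modules) are illustrative assumptions, not the values used to train this model.

```python
# Minimal QLoRA setup sketch: 4-bit quantized base model + LoRA adapters.
# LoRA hyperparameters below are illustrative, not Yoko's training config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-2-13b-chat-hf"

# Load the frozen base model in 4-bit NF4 precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters (the "LoRA" part).
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```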
Key Capabilities
- Multilingual Performance: Demonstrates improved capabilities in generating and understanding text in Japanese and Chinese.
- Training Data: Fine-tuned on the llm-japanese-dataset, supplemented by 50,000 chat samples and 280,000 non-chat samples, giving a broad training base for diverse applications.
- Base Model: Leverages the robust architecture of Llama-2-13b-chat-hf.
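This card does not state whether the repository publishes merged full-model weights or a standalone LoRA adapter. The sketch below assumes the adapter path, which is common for QLoRA releases, and attaches it to the base model with peft; if the weights are merged, loading the repo id directly with AutoModelForCausalLM.from_pretrained suffices.

```python
# Sketch: attach the (assumed) LoRA adapter to the Llama-2 base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-13b-chat-hf"
adapter_id = "ganchengguang/Yoko_13B_Japanese_QLoRA"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # applies the adapter weights
```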
Recommended Usage
This model is well-suited for applications requiring strong Japanese and Chinese language processing. Recommended generation parameters for optimal output:
- temperature: 0.5 to 0.7
- top_p: 0.65 to 1.0
- top_k: 30 to 50
- repeat_penalty: 1.03 to 1.17
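As an illustration, the sketch below applies those recommended ranges in a transformers generate call. The specific values picked within each range and the Japanese prompt are arbitrary examples, and it assumes merged weights are hosted under the repo id (see the loading note above); note that transformers names the repeat penalty `repetition_penalty`.

```python
# Sampling sketch using values from the recommended ranges above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ganchengguang/Yoko_13B_Japanese_QLoRA"  # assumes merged weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "日本の首都はどこですか？"  # "What is the capital of Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,         # recommended 0.5-0.7
    top_p=0.9,               # recommended 0.65-1.0
    top_k=40,                # recommended 30-50
    repetition_penalty=1.1,  # the card's repeat_penalty, recommended 1.03-1.17
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```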