maywell/kiqu-70b
kiqu-70b: A Korean-Optimized 70B Language Model
kiqu-70b is a 69 billion parameter language model developed by maywell, built upon the Miqu-70B-Alpaca-DPO base model. It has undergone Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) on extensive Korean datasets, making it highly proficient in Korean language tasks. The base model, miqu-1-70b, is an early leaked version of Mistral-Medium, a provenance users should weigh before any commercial use.
Key Capabilities & Features
- Korean Language Specialization: Optimized for natural and clean responses in Korean through dedicated SFT and DPO training.
- Base Architecture: Derived from Miqu-70B-Alpaca-DPO, which itself is based on an early version of Mistral-Medium.
- Instruction Format: Adheres to the standard Mistral instruction format (`[INST] {instruction} [/INST] {output}`); see the usage sketch after this list.
- Context Length: Supports a substantial context window of 32,768 tokens.
- Inference Optimization: Recommends avoiding trailing spaces after `[/INST]` in the chat template for optimal performance during inference.
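To make the format concrete, here is a minimal inference sketch using Hugging Face transformers. The repository ID is taken from the model name above; the Korean prompt, dtype, and generation settings are illustrative assumptions, not values from the model card.

```python
# Minimal inference sketch for kiqu-70b via Hugging Face transformers.
# Assumptions: fp16 weights and the sampling settings below are
# illustrative, not prescribed by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "maywell/kiqu-70b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 70B model needs multiple GPUs or quantization
    device_map="auto",
)

instruction = "한국의 전통 음식 세 가지를 소개해 주세요."  # "Introduce three traditional Korean dishes."

# Mistral instruction format. Note: no trailing space after [/INST],
# per the inference recommendation above.
prompt = f"[INST] {instruction} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```

Because the prompt is assembled by hand here, the trailing-space caveat is easy to honor; if you rely on `tokenizer.apply_chat_template` instead, inspect the rendered string to confirm no space follows `[/INST]`.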
Licensing & Usage
The model itself is released under the cc-by-sa-4.0 license. However, because its base model (miqu-1-70b) is an early leaked version of Mistral-Medium, any commercial use is at the user's own risk.
Recommended Use Cases
- Applications requiring high-quality Korean text generation.
- Korean-centric chatbots and conversational AI systems.
- Tasks benefiting from a large language model with strong Korean language understanding.