maywell/kiqu-70b

Text Generation | Concurrency Cost: 4 | Model Size: 69B | Quant: FP8 | Ctx Length: 32k | Published: Feb 17, 2024 | License: cc-by-sa-4.0 | Architecture: Transformer | Open Weights | Cold

maywell/kiqu-70b is a 69 billion parameter language model developed by maywell, fine-tuned from Miqu-70B-Alpaca-DPO using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) with Korean datasets. This model, based on an early leaked version of Mistral-Medium, is specifically optimized for high-quality Korean language generation and understanding. It features a 32K context length and follows the Mistral instruction format, making it suitable for various Korean NLP applications.

kiqu-70b: A Korean-Optimized 70B Language Model

kiqu-70b is a 69-billion-parameter language model developed by maywell, built on the Miqu-70B-Alpaca-DPO base model. It has undergone Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) on Korean datasets to improve its performance on Korean language tasks. The underlying model, miqu-1-70b, is an early leaked version of Mistral-Medium, which users should take into account before any commercial use.

Key Capabilities & Features

  • Korean Language Specialization: Optimized for natural and clean responses in Korean through dedicated SFT and DPO training.
  • Base Architecture: Derived from Miqu-70B-Alpaca-DPO, which itself is based on an early version of Mistral-Medium.
  • Instruction Format: Adheres to the standard Mistral instruction format ([INST] {instruction} [/INST] {output}).
  • Context Length: Supports a substantial context window of 32,768 tokens.
  • Inference Optimization: For best results at inference time, do not leave a trailing space after [/INST] in the chat template.
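
The instruction format and the trailing-space recommendation above can be sketched as a small prompt builder. This is a minimal illustration, not code from the model author: the function name build_prompt and the history parameter are hypothetical, and joining earlier turns with a single space is an assumption, since the model card only specifies the single-turn layout. Special tokens such as the beginning-of-sequence marker are assumed to be added by the tokenizer and are not handled here.

```python
from typing import Iterable, Tuple


def build_prompt(instruction: str, history: Iterable[Tuple[str, str]] = ()) -> str:
    """Build a prompt in the Mistral instruction format.

    `history` is a hypothetical parameter holding earlier
    (instruction, output) pairs; joining turns with a single space
    is an assumption, not something the model card specifies.
    """
    parts = []
    for past_instruction, past_output in history:
        # A completed turn follows: [INST] {instruction} [/INST] {output}
        parts.append(f"[INST] {past_instruction.strip()} [/INST] {past_output.strip()}")

    # The open turn ends exactly at [/INST], with no trailing space,
    # as the model card recommends for inference.
    parts.append(f"[INST] {instruction.strip()} [/INST]")
    return " ".join(parts)
```

Calling build_prompt("한국의 수도는 어디인가요?") returns a string that ends with [/INST] and no whitespace after it. Stripping the inputs keeps stray whitespace in user text from reintroducing a trailing space.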

Licensing & Usage

The model itself is released under the cc-by-sa-4.0 license. However, because its base model (miqu-1-70b) is an early leaked version of Mistral-Medium, commercial use is at the user's own risk.

Recommended Use Cases

  • Applications requiring high-quality Korean text generation.
  • Korean-centric chatbots and conversational AI systems.
  • Tasks benefiting from a large language model with strong Korean language understanding.