Overview
kno10/ende-chat-0.0.4 is a 7-billion-parameter LoRA finetune of Mistral-7B-v0.1, developed by Erich Schubert. Its primary goal is to enhance German-language capability through continued finetuning while preserving English proficiency. This dual-language focus makes the model suitable for tasks such as translating between German and English and answering questions in one language based on documents in the other.
Key Capabilities
- Bilingual Proficiency: Designed for high-quality text generation in both German and English.
- Chat Foundation: Intended as a base model for chat applications in German and English.
- Context Length: Supports an 8192-token context window, enabling processing of longer inputs.
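To make the chat usage above concrete, here is a minimal sketch. The `build_prompt` helper and its "role: content" layout are hypothetical placeholders for illustration only; the model's actual chat template should be taken from its tokenizer configuration. The commented-out generation code assumes the standard Hugging Face transformers API.

```python
# Minimal usage sketch. The chat layout below is an ASSUMPTION for
# illustration; consult the model's tokenizer config for the real template.

def build_prompt(messages):
    """Flatten a list of {"role", "content"} dicts into one prompt string.

    NOTE: this simple "role: content" format is a hypothetical placeholder,
    not the model's verified chat template.
    """
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)

prompt = build_prompt([
    {"role": "user", "content": "Bitte übersetze ins Englische: Guten Morgen!"},
])

# To actually generate text, load the weights with Hugging Face transformers
# (requires downloading the model; shown here for orientation only):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("kno10/ende-chat-0.0.4")
# model = AutoModelForCausalLM.from_pretrained("kno10/ende-chat-0.0.4")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=128)
# print(tok.decode(out[0], skip_special_tokens=True))
```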
Training & Limitations
The model was LoRA-finetuned with LLaMA-Factory on a mixture of English and German data, prioritizing data quality. Instruction finetuning used several datasets, including sharegpt-deutsch, oasst_de, and evol_instruct_de. Acknowledged limitations: compute resources during training were insufficient, forcing heavy subsampling of the data, and some of the German finetuning data was automatically translated from English, which may affect quality. On standard benchmarks the model scores below Mistral-7B-v0.1 but above Mistral-7B-Instruct-v0.1, with the caveat that these benchmarks are English-only and do not capture the German improvements.
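For readers unfamiliar with the technique, the core idea of LoRA can be sketched in a few lines of NumPy: the frozen pretrained weight is augmented by a trainable low-rank product. The dimensions, rank, and scaling below are toy values for illustration; the actual hyperparameters used for ende-chat-0.0.4 are not stated here.

```python
# LoRA in a nutshell: instead of updating the full weight matrix W, train two
# small low-rank factors A (r x d_in) and B (d_out x r); the effective weight
# is W + (alpha / r) * B @ A. All numbers here are illustrative ASSUMPTIONS,
# not the settings used to train this model.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16   # toy sizes; real layers are far larger

W = rng.normal(size=(d_out, d_in))    # frozen pretrained weight
A = rng.normal(size=(r, d_in))        # trainable down-projection
B = np.zeros((d_out, r))              # trainable up-projection (zero-initialized)

W_eff = W + (alpha / r) * (B @ A)     # effective weight after merging the adapter

# Because B starts at zero, the finetune begins exactly at the base model:
print(np.allclose(W_eff, W))          # True at initialization
```

Only A and B are trained, which is why LoRA finetuning is feasible on the limited compute described above: the number of trainable parameters is a small fraction of the full 7B weights.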
Use Cases
- Bilingual Chatbots: Ideal for conversational agents requiring fluency in both German and English.
- Translation Assistance: Can be used for generating translations or understanding cross-lingual queries.
- Bilingual Information Retrieval: Answering German questions based on English documents, and vice versa.