jphme/vicuna-13b-v1.3-ger: German-Optimized Vicuna
This model is a specialized version of LMSYS's Vicuna 13b v1.3 (13 billion parameters), fine-tuned on an additional German-language dataset. Its primary goal is to improve German text understanding and generation, making it better suited than the original Vicuna for applications that require German-language interaction.
Key Capabilities & Features
- German Language Optimization: Fine-tuned on an experimental German dataset to improve performance in German text processing.
- Factual Retrieval: Includes fine-tuning data specifically targeting factual retrieval, aiming to reduce hallucination and provide context-bound answers.
- Base Model: Built upon the Vicuna 13b v1.3 architecture, which was originally fine-tuned from LLaMA on ShareGPT conversations.
- Prompt Template: Utilizes a chat-based prompt template for conversational interactions.
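The chat-based prompt template can be applied with a small helper. The system message and the `USER:`/`ASSISTANT:` role tags below follow the standard upstream Vicuna v1.3 template; this is a sketch under that assumption, so verify it against the model card's own examples before relying on it:

```python
# Builds a single-turn prompt in the standard Vicuna v1.3 chat format.
# The system message and role tags follow the upstream Vicuna template
# (an assumption -- adjust if this model card specifies otherwise).

VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the Vicuna v1.3 conversation format."""
    return f"{VICUNA_SYSTEM} USER: {user_message} ASSISTANT:"

prompt = build_prompt("Was ist die Hauptstadt von Deutschland?")
```

The resulting string is passed to the tokenizer as-is; the model then generates the assistant's reply after the trailing `ASSISTANT:` tag.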
Considerations & Limitations
- Experimental Dataset: The German fine-tuning was performed on a small, experimental dataset, and the model's capabilities are still under development.
- Multi-turn Inconsistencies: Potential issues with <eos> tokens during fine-tuning preparation may lead to inconsistencies in multi-turn chat applications.
- Limited Evaluation: Evaluation has been primarily based on small, handcrafted German test samples, showing improved German text generation over the base model in many situations.
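To see why `<eos>` handling matters in multi-turn use: the Vicuna format terminates each completed assistant reply with the end-of-sequence marker (`</s>` in LLaMA's tokenizer) before the next user turn. The sketch below flattens a chat history into a single prompt under that assumption; the exact separator should be checked against the model's tokenizer configuration:

```python
# Flattens a multi-turn history into a Vicuna-style prompt.
# Each finished assistant reply is terminated with "</s>" (LLaMA's EOS
# token), as in the upstream Vicuna v1.3 template -- an assumption to
# verify against this model's tokenizer config.

SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_chat_prompt(history, next_user_message):
    """history: list of (user, assistant) turn pairs already completed."""
    parts = [SYSTEM]
    for user, assistant in history:
        parts.append(f" USER: {user} ASSISTANT: {assistant}</s>")
    parts.append(f" USER: {next_user_message} ASSISTANT:")
    return "".join(parts)

prompt = build_chat_prompt(
    [("Wie heißt du?", "Ich bin ein KI-Assistent.")],
    "Was kannst du?",
)
```

If the fine-tuning data did not consistently place the EOS token after assistant turns, the model may fail to stop cleanly at turn boundaries, which is the inconsistency noted above.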
Good For
- Developers and researchers working on German-centric natural language processing tasks.
- Applications requiring a large language model with enhanced German language understanding and generation capabilities.
- Use cases where factual retrieval from provided context in German is important.