MKLLM-7B-Instruct Overview
MKLLM-7B-Instruct is a 7-billion-parameter instruction-tuned Large Language Model developed by trajkovnikola, designed specifically for the Macedonian language. It is based on the Mistral-7B-v0.1 architecture and was further pretrained on a mixed corpus of Macedonian and English text totaling approximately 300 million tokens, for two epochs. This continued pretraining yields a model highly capable of understanding and processing Macedonian.
Key Capabilities and Performance
- Macedonian Language Proficiency: Demonstrates strong capabilities in understanding and generating coherent Macedonian text.
- Instruction Following: Instruction-tuned using the chatml format, enabling effective conversational interactions.
- Benchmark Performance: Outperforms Meta's Llama3-8B-Instruct and Mistral's Mistral-7B-Instruct-v0.3 on Macedonian-translated benchmarks, particularly in understanding tasks. The developers also note superior generation capabilities and fluency in Macedonian.
- Base Model: Built on the robust Mistral-7B-v0.1 foundation.
Usage and Limitations
- Chat Template: Utilizes the chatml format for prompting, which can be applied using tokenizer.apply_chat_template().
- Hallucination: Users should be aware that the model can hallucinate and produce factually incorrect output, especially on Macedonian-specific topics, due to the relatively small Macedonian portion of the training data.
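To illustrate the prompting convention above, here is a minimal sketch of what a chatml-formatted prompt looks like. The helper function and the example question are illustrative only; in practice you would load the model's tokenizer with `transformers.AutoTokenizer` and call `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` to produce the equivalent string.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts in the chatml layout.

    Illustrative stand-in for tokenizer.apply_chat_template(); each turn is
    wrapped in <|im_start|>{role} ... <|im_end|> markers.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Open an assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical example conversation (a Macedonian user question).
prompt = build_chatml_prompt([
    {"role": "system", "content": "Ти си корисен асистент."},
    {"role": "user", "content": "Кој е главниот град на Македонија?"},
])
print(prompt)
```

The resulting string is then tokenized and passed to the model's generate call as with any causal LM.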