nassimjp/Maral-7B-alpha-1
nassimjp/Maral-7B-alpha-1 is a 7-billion-parameter large language model developed by Muhammadreza Haghiri and Mahi Mohrechi, based on the Mistral architecture with a 4096-token context length. It specializes in Persian, having been trained on an Alpaca Persian dataset, while retaining the English capabilities of its Mistral base. The model is intended to advance Persian language processing and offers instruction-following capabilities for a range of applications.
Maral-7B-alpha-1: A Persian-Focused LLM
Maral-7B-alpha-1 is a 7-billion-parameter large language model built on the Mistral architecture and fine-tuned for Persian. It was developed by Muhammadreza Haghiri and Mahi Mohrechi as an effort to improve AI capabilities for Persian speakers.
Key Capabilities and Features
- Persian Language Specialization: Primarily trained on an Alpaca Persian dataset, making it proficient in generating and understanding Persian text.
- Multilingual Support: Inherits English language capabilities from its Mistral base, allowing for responses in both Persian and English.
- Instruction Following: Designed to follow instructions using the Guanaco prompt format (`### Human: <prompt>\n### Assistant: <answer>`).
- Resource Optimization: Supports 4-bit and 8-bit quantization for inference on consumer hardware or smaller GPUs, with options for PEFT and `bitsandbytes` integration.
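The prompt format and quantized loading described above can be sketched roughly as follows. This is a minimal illustration, not the authors' reference code: the helper names `build_prompt`, `extract_answer`, and `load_4bit` are hypothetical, the exact whitespace around the tags is an assumption based on the format string shown above, and the model ID is taken from this page's title. The loading step assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available.

```python
HUMAN_TAG = "### Human:"
ASSISTANT_TAG = "### Assistant:"


def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the Guanaco-style template."""
    # The prompt ends at the assistant tag so the model writes the answer.
    return f"{HUMAN_TAG} {user_message.strip()}\n{ASSISTANT_TAG}"


def extract_answer(generated_text: str) -> str:
    """Return the first assistant reply, dropping any extra turns the model invents."""
    # partition() yields an empty answer if the assistant tag is missing.
    _, _, answer = generated_text.partition(ASSISTANT_TAG)
    return answer.split(HUMAN_TAG, 1)[0].strip()


def load_4bit(model_id: str = "nassimjp/Maral-7B-alpha-1"):
    """Load the tokenizer and a 4-bit quantized model (assumed setup, see lead-in)."""
    # Imports are local so the prompt helpers work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
    return tokenizer, model
```

For 8-bit inference, `BitsAndBytesConfig(load_in_8bit=True)` could be passed instead. The prompt helpers are pure string functions, so they can be checked without downloading the model.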
Known Issues and Considerations
- Hallucinations: The model can produce highly imaginative but potentially misleading answers, particularly on reasoning tasks in Persian. This is an area for future improvement through better datasets and training (e.g., DPO).
- Repetitive Output: May generate repetitive text; a temporary solution is to keep the generation temperature between 0.5 and 0.7.
- Grammar Quality: While capable of GPT-3.5-level grammar in Persian, it may still make grammatical errors that further training could address.
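The temperature workaround for repetitive output could be applied along these lines. This is a sketch under stated assumptions: the helper name `generation_kwargs` and the default of 256 new tokens are hypothetical, and clamping out-of-range values (rather than rejecting them) is a design choice made here, not something the model authors specify.

```python
def generation_kwargs(temperature: float = 0.6, max_new_tokens: int = 256) -> dict:
    """Build sampling settings with temperature kept in the suggested 0.5 to 0.7 band."""
    low, high = 0.5, 0.7
    # Clamp instead of raising, so callers always get usable settings.
    clamped = min(max(temperature, low), high)
    return {
        "do_sample": True,  # temperature only takes effect when sampling
        "temperature": clamped,
        "max_new_tokens": max_new_tokens,
    }
```

With a Hugging Face `transformers` model, these settings would typically be passed as `model.generate(**inputs, **generation_kwargs())`.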
Use Cases
This model is particularly well-suited for applications requiring natural language understanding and generation in Persian, such as chatbots, content creation, and language translation, especially where a balance between Persian and English capabilities is beneficial.