Maral-7B-alpha-1: A Persian-Focused LLM
Maral-7B-alpha-1 is a large language model developed by MaralGPT that specializes in the Persian language. Built on the Mistral-7B-v0.1 architecture, it was fine-tuned on an Alpaca-style Persian dataset, representing a significant effort to advance AI capabilities for Persian-speaking communities.
Key Capabilities & Features
- Persian Language Specialization: Primarily focused on generating responses in Persian, addressing a gap in LLMs for this language.
- Bilingual Support: While specialized in Persian, its Mistral base allows it to produce English answers effectively as well.
- Guanaco Prompt Format: Requires the `### Human: <prompt>\n### Assistant: <answer>` format for optimal inference.
- Quantization Options: Supports 4-bit quantization via a provided PEFT model, and 8-bit quantization for deployment on smaller GPUs or consumer hardware.
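The Guanaco-style prompt format above can be sketched as a small helper. This is a minimal illustration, not code from the MaralGPT repository; the repository id and the 8-bit loading flags in the comments are assumptions based on common Hugging Face usage.

```python
def format_prompt(user_message: str) -> str:
    """Build a Guanaco-style prompt for Maral-7B-alpha-1.

    The model expects '### Human: <prompt>\n### Assistant: ' and then
    completes the text after the Assistant tag.
    """
    return f"### Human: {user_message}\n### Assistant: "

# Hypothetical loading sketch (model id and flags assumed, not verified):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model = AutoModelForCausalLM.from_pretrained(
#     "MaralGPT/Maral-7B-alpha-1", load_in_8bit=True, device_map="auto"
# )

prompt = format_prompt("What is the capital of Iran?")
```

The generated text that follows the `### Assistant: ` tag is the model's answer; everything before it is the instruction wrapper.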
Known Limitations
- Hallucinations: The model is noted for generating "extremely insane hallucinations," particularly on reasoning problems in Persian, and can produce misleading answers. The authors attribute this to the current dataset and training procedure.
- Resource Intensive: Requires significant computational resources, though future GPTQ or GGUF versions are planned.
- Repetitive Output: May repeat itself; as a temporary workaround, keep the generation temperature between 0.5 and 0.7.
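The temperature workaround above can be expressed as a small set of generation settings. This is a sketch, not an official configuration; the key names follow the Hugging Face `generate` API, and `repetition_penalty` is an assumed extra guard not mentioned in the source.

```python
# Suggested sampling settings; temperature kept inside the 0.5-0.7 band
# recommended above to reduce repetitive output.
generation_kwargs = {
    "do_sample": True,          # sampling is required for temperature to matter
    "temperature": 0.6,         # midpoint of the recommended 0.5-0.7 range
    "max_new_tokens": 256,
    "repetition_penalty": 1.1,  # assumed extra guard against repetition
}

# Hypothetical usage: model.generate(**inputs, **generation_kwargs)
```

Greedy decoding (`do_sample=False`) ignores temperature entirely, which is why sampling must be enabled for this workaround to take effect.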
Good For
- Persian Language Applications: Ideal for use cases requiring natural language understanding and generation in Persian.
- Bilingual Chatbots/Assistants: Can serve as a foundation for applications that need to interact in both Persian and English.
- Research and Development: Provides a base for further experimentation and improvement in Persian LLM development, especially concerning dataset quality and training methodologies.