MaralGPT/Maral-7B-alpha-1

Text Generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Dec 24, 2023 · License: MIT · Architecture: Transformer · Open weights

MaralGPT/Maral-7B-alpha-1 is a 7 billion parameter large language model developed by MaralGPT, specializing in the Persian language. Based on the Mistral architecture and fine-tuned on an Alpaca Persian dataset, this model aims to enhance AI capabilities for Persian speakers. It is capable of generating both Persian and English responses, making it suitable for bilingual applications with a focus on Persian language tasks.


Maral-7B-alpha-1: A Persian-Focused LLM

Maral-7B-alpha-1 is a large language model developed by MaralGPT, specifically designed to specialize in the Persian language. Built upon the robust Mistral-7B-v0.1 architecture, this model was fine-tuned using an Alpaca Persian dataset, representing a significant effort to advance AI capabilities for Persian-speaking communities.

Key Capabilities & Features

  • Persian Language Specialization: Primarily focused on generating responses in Persian, addressing a gap in LLMs for this language.
  • Bilingual Support: While specialized in Persian, its Mistral base allows it to effectively produce English answers as well.
  • Guanaco Prompt Format: For best results at inference, prompts must follow the Guanaco template `### Human: <prompt>\n### Assistant: <answer>`.
  • Quantization Options: Supports 4-bit quantization with a provided PEFT model and 8-bit quantization for deployment on smaller GPUs or consumer hardware.
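The Guanaco template above can be assembled with a small helper. This is a minimal sketch: the function name is illustrative, but the format string follows the template stated in the model card (at inference time the `<answer>` part is left empty for the model to complete).

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the Guanaco-style format Maral-7B-alpha-1 expects.

    The assistant turn is left open so the model generates the answer.
    """
    return f"### Human: {user_message}\n### Assistant: "

# Example: a Persian greeting ("Hello, how are you?")
prompt = build_prompt("سلام، حالت چطوره؟")
print(prompt)
```

The resulting string is what you would pass to the tokenizer before generation; everything after `### Assistant: ` in the model's output is the answer.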

Known Limitations

  • Hallucinations: The developers warn that the model can produce severe hallucinations, described in the model card as "extremely insane hallucinations," particularly on reasoning problems in Persian, and may give misleading answers. They attribute this to the current dataset and training procedures.
  • Resource Intensive: Requires significant computational resources, though future GPTQ or GGUF versions are planned.
  • Repetitive Output: The model may repeat itself; as a temporary workaround, keep the generation temperature between 0.5 and 0.7.
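The temperature workaround can be captured as a small set of generation parameters. Only the temperature range comes from the model card; the other values (`do_sample`, `max_new_tokens`, `top_p`) are illustrative assumptions, not documented settings.

```python
# Sampling parameters for Maral-7B-alpha-1 text generation.
gen_kwargs = {
    "temperature": 0.6,     # within the 0.5-0.7 range recommended to curb repetition
    "do_sample": True,      # assumption: temperature only matters with sampling enabled
    "max_new_tokens": 256,  # assumption: illustrative output-length limit
    "top_p": 0.9,           # assumption: a common nucleus-sampling default
}

# Sanity-check that the temperature stays in the recommended band.
assert 0.5 <= gen_kwargs["temperature"] <= 0.7
print(gen_kwargs)
```

These keyword arguments match the shape expected by typical `generate()`-style APIs, but verify them against whichever inference stack you deploy with.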

Good For

  • Persian Language Applications: Ideal for use cases requiring natural language understanding and generation in Persian.
  • Bilingual Chatbots/Assistants: Can serve as a foundation for applications that need to interact in both Persian and English.
  • Research and Development: Provides a base for further experimentation and improvement in Persian LLM development, especially concerning dataset quality and training methodologies.