AbderrahmanSkiredj1/GemMaroc-27b-it
GemMaroc-27b-it by AbderrahmanSkiredj1 is a 27 billion parameter decoder-only Transformer model, based on Google's Gemma 3 architecture, with a context length of 2,048 tokens. It is specifically fine-tuned for Moroccan Darija proficiency using a minimal-data approach, while preserving strong cross-lingual reasoning abilities. This model excels at generating fluent Darija and English instructions, making it ideal for applications targeting the over 36 million speakers of Moroccan Arabic.
Loading preview...
GemMaroc-27B: Darija Proficiency with Green AI
GemMaroc-27B is a 27 billion parameter large language model developed by Abderrahman Skiredj, fine-tuned from Google's Gemma 3 architecture. Its primary goal is to unlock Moroccan Darija proficiency, addressing the underserved population of over 36 million Moroccan Arabic speakers. This model stands out for its "minimal-data, green-AI" training recipe, which efficiently adds fluent Darija generation while maintaining Gemma-27B's robust reasoning capabilities.
Key Capabilities
- Fluent Darija Generation: Specifically trained to understand and generate Moroccan Darija instructions.
- Cross-Lingual Reasoning: Preserves strong reasoning abilities from its Gemma 3 base, with 20% of training data kept in English for robustness.
- Efficient Training: Achieves high Darija competence with a significantly lower compute budget (48 GPU·h) compared to similar models, emphasizing a "quality-over-quantity" approach to data.
- Instruction Following: Supervised fine-tuning on 50K high-quality Darija/English instructions.
Good For
- Inclusive AI Applications: Developing LLM-powered tools and services for Moroccan Arabic speakers.
- Reasoning Tasks: Leveraging its strong reasoning foundation for complex problem-solving.
- Resource-Efficient Deployment: Suitable for scenarios where energy consumption and training costs are a concern.
- Multilingual Chatbots: Creating conversational agents that can fluently interact in both Darija and English.
Benchmark Highlights
GemMaroc-27B demonstrates competitive performance against Atlas-Chat-27B, achieving 60.5% on Darija HellaSwag and 84.2% on GSM8K @5, indicating strong reasoning and language understanding in both Darija and English contexts.