yasserrmd/kallamni-4b-v1
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kLicense:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Warm

Kallamni-4B-v1 by yasserrmd is a 4 billion parameter conversational Arabic language model specifically fine-tuned for the Emirati dialect. It is designed to understand and generate natural spoken Emirati Arabic, capturing its unique vocabulary, phrasing, and emotional tone. This model excels at producing authentic daily UAE conversation, avoiding Modern Standard Arabic constructs, making it ideal for applications requiring highly localized Arabic interaction.

Loading preview...

Kallamni-4B-v1: Authentic Emirati Arabic Conversational Model

Kallamni-4B-v1 is a 4 billion parameter language model developed by yasserrmd, meticulously fine-tuned to specialize in natural spoken Emirati Arabic. Unlike general Arabic models, Kallamni-4B focuses on capturing the nuances of daily UAE dialect, including its specific vocabulary, phrasing, and cultural references, rather than Modern Standard Arabic (MSA).

Key Capabilities & Features

  • Authentic Emirati Dialect Generation: Designed to produce text that sounds like genuine daily UAE conversation, incorporating words like “وايد”, “هيه”, “سرت”, “عقب”, “الربع”, “القعدة”, “نغير جو”.
  • Conversational Fluidity: Builds upon previous versions (1.2B, 2.6B) to enhance dialect fidelity, consistency, and conversational flow.
  • Specialized Training Data: Fine-tuned on 58,000 synthetic Emirati conversation samples, manually filtered for dialect accuracy.
  • Tokenizer Extension: Includes Emirati-specific tokens to preserve dialect word merges, crucial for accurate representation.
  • High Human Evaluation Scores: Consistently rated by human evaluators as >90% authentic Emirati dialect.

Ideal Use Cases

  • Emirati-specific Chatbots: Creating conversational agents that interact naturally in the UAE dialect.
  • Content Generation: Producing dialogues, social media posts, or narratives that resonate with an Emirati audience.
  • Cultural Immersion Applications: Tools for learning or practicing authentic Emirati spoken Arabic.

Limitations & Ethical Use

Licensed under CC-BY-NC-4.0, the model does not collect personal user data. Users are advised to use it responsibly, avoiding the generation of misinformation, impersonation, or harmful content. Outputs published publicly should cite their AI-generated nature.