jmcinern/Qomhra-AWQ

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Sep 23, 2025License:apache-2.0Architecture:Transformer Open Weights Cold

jmcinern/Qomhra-AWQ is an 8-billion parameter, activation-aware quantized version of Qomhrá, a bilingual Large Language Model (LLM) developed by researchers at Trinity College Dublin, University College Cork, and Queen's University Belfast. Adapted from Qwen3-8B, it is specifically designed to support the low-resource Irish language (Gaeilge) while maintaining strong English capabilities, offering a 32768 token context length. This model excels in Irish language understanding and generation, outperforming existing open-source baselines in benchmarks like Cloze-gle, SIB-gle, and IQA-gle/eng.

Loading preview...

Qomhrá-AWQ: A Bilingual Irish & English LLM

Qomhrá-AWQ is an 8-billion parameter, activation-aware quantized model based on Qomhrá, developed by researchers at Trinity College Dublin, University College Cork, and Queen's University Belfast. It is adapted from Qwen3-8B and specifically engineered to support the low-resource Irish language (Gaeilge) alongside English, aiming to provide an open-weight alternative for the Irish language community.

Key Capabilities

  • Bilingual Proficiency: Optimized for both Irish and English, maintaining strong English capabilities through a high mixture of English data during continued pre-training.
  • Irish Language Excellence: Outperforms existing open-source baselines in Irish understanding and generation benchmarks, including grammatical gender (Cloze-gle), topic modeling (SIB-gle), and question answering (IQA-gle).
  • Robust Training: Developed using a two-stage pipeline: Bilingual Continued Pre-Training (CPT) on a 3.265 billion character corpus (75% Irish, 25% English) and Instruction Tuning with a 30k sample parallel English-Irish dataset.

Good For

  • Applications requiring strong performance in Irish language processing.
  • Use cases demanding bilingual (Irish-English) text generation and understanding.
  • Developers seeking an open-source LLM for low-resource language support.