Name: lang-uk/OmniGEC-Minimal-12B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: lang-uk

OmniGEC-Minimal-12B: Multilingual Grammatical Error Correction

OmniGEC-Minimal-12B is a 12 billion parameter model from lang-uk, built upon the Gemma-3-12B-IT architecture. It has been extensively instruction-tuned and supervised fine-tuned using the OmniGEC corpus, a silver-standard GEC dataset. This corpus integrates MultiGEC-25, Wikipedia, and Reddit edits across 11 European languages, including Czech, English, Estonian, German, Greek, Italian, Latvian, Slovenian, Swedish, and Ukrainian.

Key Capabilities

Paragraph-level Correction: Excels at correcting grammatical errors within entire paragraphs, not just individual sentences.
Multilingual Support: Provides robust GEC capabilities for 11 low- and mid-resource European languages.
State-of-the-Art Performance: Achieves SOTA results for paragraph-based editing in both minimal and fluency tracks, surpassing baseline models like LLaMA-3-8B by 9–10 GLEU points on the minimal track.
Enhanced for Specific Languages: Delivers the current best open scores for Estonian and Latvian on the MultiGEC-25 test set.

Training and Evaluation

The model was trained on a diverse dataset including WikiEdits-MultiGEC (human Wikipedia revisions), Reddit-MultiGEC (posts from language-specific subreddits with GPT-4o-mini corrections), and MultiGEC-25 golden shared-task data. Evaluation was performed using the GLEU metric via the official MultiGEC-25 CodaLab evaluator.

Good For

Automated Text Correction: Ideal for applications requiring high-quality grammatical error correction in multiple European languages.
Content Refinement: Useful for improving the fluency and correctness of written content in supported languages.
Research in GEC: Provides a strong baseline and SOTA performance for further research in multilingual GEC, particularly for low-resource languages.

Overview

OmniGEC-Minimal-12B: Multilingual Grammatical Error Correction

Key Capabilities

Training and Evaluation

Good For

Full Model Card (README)