Name: twnlp/ChineseErrorCorrector4-4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: twnlp

ChineseErrorCorrector4-4B: Precision CGEC and CSC

ChineseErrorCorrector4-4B (CSRP) is a 4-billion-parameter model developed by Wei Tian, Yuhao Zhou, and Man Lan for high-precision Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC). It was presented at ACL 2026 and reports new state-of-the-art results on the NACGEC and CSCD benchmarks.

Key Capabilities & Differentiators

Addresses Over-Correction Bias: LLMs often rewrite text that is already correct. CSRP counters this with a three-stage training framework:
- Balanced Continued Pre-training (CPT): Internalizes linguistic priors using 5.9M samples, mixed 8:2 between general and correction-specific data.
- Rationale-Augmented SFT: Distills Chain-of-Thought reasoning to guide error diagnosis.
- Efficiency-Aware Policy Alignment: Uses GRPO with a novel Efficiency-Aware Reward (EAR) to penalize unnecessary edits and reward precise corrections.
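The idea behind an efficiency-aware reward can be illustrated with a toy function. This is a minimal sketch and not the EAR formula from the paper, which this page does not give: the name `toy_efficiency_reward`, the `penalty` weight, and the use of character-level edit distance are all assumptions made for illustration. It rewards outputs close to the reference and subtracts a penalty for edits beyond what the reference required.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance using a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def toy_efficiency_reward(source: str, hypothesis: str, reference: str,
                          penalty: float = 0.5) -> float:
    """Hypothetical reward (assumption, not the paper's EAR)."""
    # Accuracy term: 1.0 when the output equals the reference.
    denom = max(len(hypothesis), len(reference), 1)
    accuracy = 1.0 - edit_distance(hypothesis, reference) / denom
    # Surplus edits: changes made beyond what the reference needed.
    needed = edit_distance(source, reference)
    made = edit_distance(source, hypothesis)
    surplus = max(0, made - needed)
    return accuracy - penalty * surplus / max(len(source), 1)


source = "我今天很高心"      # contains one wrong character
reference = "我今天很高兴"   # minimal correction
rewritten = "我今日非常高兴"  # fixes the error but also rewrites correct text

precise_reward = toy_efficiency_reward(source, reference, reference)
rewritten_reward = toy_efficiency_reward(source, rewritten, reference)
```

Under these assumptions the minimal correction scores higher than the heavier rewrite, which is the behavior the third training stage is described as encouraging.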
State-of-the-Art Performance:
- Achieves an $F_{0.5}$ of 50.99 on the NACGEC benchmark for Chinese Grammatical Error Correction, surpassing previous specialized large models.
- Attains a Correction F1 of 59.61 on the CSCD benchmark for Chinese Spelling Check, outperforming GPT-4 (Few-shot).
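For readers unfamiliar with the metrics: $F_{\beta}$ combines precision and recall, and $\beta = 0.5$ weights precision more heavily than recall, so unnecessary edits cost more than missed errors. The short sketch below uses the standard definition; the precision and recall values are made-up inputs for illustration, not results for this model.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both terms vanish
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative inputs only (not taken from any benchmark).
high_precision = f_beta(precision=0.6, recall=0.4)   # F0.5 favors this system
high_recall = f_beta(precision=0.4, recall=0.6)
balanced_f1 = f_beta(precision=0.6, recall=0.4, beta=1.0)
```

With $\beta = 0.5$, the high-precision system scores above the high-recall one even though both have the same F1.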

When to Use This Model

High-accuracy Chinese text correction: Ideal for applications requiring precise identification and correction of grammatical and spelling errors in Chinese.
Minimizing false positives: Its design specifically targets and reduces the common issue of over-correction in LLM-based systems.
Research and development in CGEC/CSC: Provides a strong baseline and advanced methodology for further research in Chinese text correction.
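When evaluating a corrector for false positives, one simple check is to run it over sentences known to be correct and count how many it changes. The sketch below assumes a hypothetical `correct_fn` callable that wraps whatever model or API you use; the stand-in corrector here is a plain string replacement, not this model.

```python
from typing import Callable, Sequence


def over_correction_rate(clean_sentences: Sequence[str],
                         correct_fn: Callable[[str], str]) -> float:
    """Fraction of already-correct sentences that the corrector modifies."""
    if not clean_sentences:
        return 0.0
    changed = sum(1 for s in clean_sentences if correct_fn(s) != s)
    return changed / len(clean_sentences)


# Stand-in corrector (assumption): rewrites a valid word, i.e. over-corrects.
def eager_corrector(sentence: str) -> str:
    return sentence.replace("今日", "今天")


clean = ["我今日很忙", "他很高兴"]
rate_eager = over_correction_rate(clean, eager_corrector)
rate_identity = over_correction_rate(clean, lambda s: s)
```

A lower rate on clean text means fewer false positives; it should be read alongside correction quality on erroneous text, since a corrector that never edits anything also scores zero here.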
