reglab-rrc/mistral-rrc: Specialized Racial Covenant Detection Model
This model, developed by reglab-rrc, is a fine-tuned Mistral 7B causal language model that identifies and extracts racial covenants from property deeds. It is open source under the MIT license and is intended to support legal reform by making discriminatory clauses in historical documents faster to locate.
Key Capabilities
- Racial Covenant Detection: Determines whether property deed text contains a racial covenant.
- Text Extraction: Pinpoints and extracts the exact raw text of detected covenants.
- Text Correction: Provides a corrected quotation of the covenant text, fixing spelling and formatting errors.
- High Performance: Reports a precision of 1.000, a recall of 0.994, and an F1 score of 0.997 for page-level detection, and a BLEU score of 0.932 for span-level accuracy.
- Legal Document Processing: Optimized for the unique language and structure of legal documents, particularly real property deeds.
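As a quick consistency check on the page-level figures listed above, the F1 score can be recomputed from the reported precision and recall as their harmonic mean. The sketch below is illustrative only; the helper name `f1_score` is hypothetical and is not part of the model's own tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid dividing by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Page-level detection figures as reported in this card.
precision, recall = 1.000, 0.994
f1 = f1_score(precision, recall)
print(f"{f1:.3f}")  # prints 0.997
```

The result rounds to 0.997, which agrees with the reported F1 score.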
Intended Use and Differentiators
This model is designed to help jurisdictions and legal professionals identify racial covenants for removal or redaction, significantly reducing the manual review required. It was trained on 3,801 annotated deed pages from eight U.S. counties, including Santa Clara County (CA), and outperformed both keyword matching and general-purpose LLMs such as GPT models on this task.
The model has known limitations: it may not generalize to jurisdictions that differ greatly from those in the training data, it is sensitive to extreme OCR artifacts, and it can misread contextually ambiguous language. Human oversight is therefore required for final verification.
The project emphasizes ethical considerations, including preserving historical memory and ensuring accountability through legal review, and it promotes accessibility by releasing the model as open source.
For more details, refer to the accompanying paper, "AI for Scaling Legal Reform: Mapping and Redacting Racial Covenants in Santa Clara County."