OCRonos is an 8-billion-parameter language model developed by PleIAs, based on the Llama-3 architecture and trained specifically to correct badly digitized texts. It excels at correcting OCR errors, word segmentation issues, and broken text structure in documents from cultural heritage and financial/administrative sources. The model is a specialized tool within the Bad Data Toolbox, designed to make challenging, deteriorated text resources usable for LLM applications and search retrieval.