Name: syntheticbot/ocr-qwen API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: syntheticbot

syntheticbot/ocr-qwen: Specialized OCR Vision-Language Model

syntheticbot/ocr-qwen is a 7 billion parameter vision-language model, fine-tuned from the robust Qwen/Qwen2.5-VL-7B-Instruct base, with a 32K context length. This model is specifically engineered for high-accuracy Optical Character Recognition (OCR) across a wide range of visual inputs, from structured documents to complex scene text.

Key Capabilities

Enhanced Text Recognition: Achieves superior accuracy in extracting text, adapting to diverse fonts, styles, sizes, and orientations.
Robust Document Handling: Designed to manage complexities like varied layouts, noise, and distortions commonly found in documents.
Structured Output: Capable of generating recognized text and layout information in structured formats such as JSON or CSV, particularly useful for invoices and tables.
Text Localization: Provides precise bounding box information for text elements within images.
Improved Visual Text Analysis: Maintains proficiency in analyzing charts and graphics, with enhanced recognition of embedded text.

Good for

Document Processing: Automating data extraction from scanned documents, PDFs, and images.
Invoice and Table Extraction: Converting visual tables and invoices into structured data formats.
Scene Text Recognition: Identifying and extracting text from real-world images and environments.
Automated Data Entry: Reducing manual effort in transcribing text from visual sources.
Content Analysis: Extracting textual information from charts, graphs, and other visual media for analytical purposes.

Overview

syntheticbot/ocr-qwen: Specialized OCR Vision-Language Model

Key Capabilities

Good for

Full Model Card (README)