Winuim/qwen3-vl-8b-invoice-cpt

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 12, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The Winuim/qwen3-vl-8b-invoice-cpt is an 8 billion parameter Qwen3-VL model, developed by Winuim, fine-tuned for invoice processing tasks. This model leverages the Qwen3-VL architecture, providing vision-language capabilities. It was trained using Unsloth and Huggingface's TRL library, optimizing for faster fine-tuning. Its primary strength lies in specialized document understanding, particularly for invoices.

Loading preview...

Model Overview

The Winuim/qwen3-vl-8b-invoice-cpt is an 8 billion parameter vision-language model, developed by Winuim. It is a fine-tuned variant of the unsloth/qwen3-vl-8b-instruct-unsloth-bnb-4bit base model, specifically adapted for invoice processing. This model benefits from being trained with Unsloth and Huggingface's TRL library, which facilitates faster fine-tuning processes.

Key Capabilities

  • Vision-Language Understanding: Inherits the Qwen3-VL architecture, enabling it to process and understand both visual (e.g., invoice images) and textual information.
  • Specialized Invoice Processing: Fine-tuned to excel in tasks related to invoices, suggesting proficiency in extracting structured data from such documents.
  • Efficient Training: Utilizes Unsloth for accelerated fine-tuning, indicating potential for rapid adaptation to specific document types.

Good For

  • Automated data extraction from invoices.
  • Document understanding applications focused on financial records.
  • Use cases requiring a vision-language model with a strong specialization in invoice content.