sukhrobnurali/qwen3vl-resume-parser

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Feb 16, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The sukhrobnurali/qwen3vl-resume-parser is an 8 billion parameter QLoRA fine-tune of Qwen/Qwen3-VL-8B-Instruct, developed by Sukhrob Nurali. This vision-language model is specifically optimized to parse resume/CV page images and extract information into a fixed 23-field JSON record. It offers a reduced VRAM footprint (~23 GB BF16) compared to larger models, making it suitable for structured data extraction from resumes in recruiting pipelines.

Loading preview...

Model Overview

sukhrobnurali/qwen3vl-resume-parser is an 8 billion parameter QLoRA fine-tune of the Qwen/Qwen3-VL-8B-Instruct vision-language model, developed by Sukhrob Nurali. It was created as an internal project at Corporate Solutions Group to provide a more efficient resume parsing solution. The model is published as merged full weights (BF16 safetensors), loading like a standard Qwen3-VL checkpoint without requiring adapter attachment.

Key Capabilities

  • Resume-to-JSON Extraction: Specialized in converting resume/CV page images into a structured 23-field JSON record, including identity, contact, skills, experiences, and education.
  • Optimized Schema: The 23-field schema and specific formatting rules are baked into the model's weights, simplifying prompts for structured output.
  • Reduced VRAM Footprint: Operates with approximately 23 GB VRAM in BF16 at 16K context, significantly less than the 50 GB required by the 32B parameter model it replaces.
  • Performance: Achieves an 83.9% weighted score and 88.2% unweighted score on a 51-sample held-out evaluation set, with 88.2% JSON validity.

When to Use This Model

  • Structured Resume Data Extraction: Ideal for extracting specific, predefined data points from resume images for recruiting or ATS pipelines.
  • Cost-Effective Parsing: Suitable when aiming to reduce GPU costs for resume parsing while maintaining parsing quality.
  • Batch Processing: Designed for batch or offline processing due to an average inference time of ~92.0 seconds per resume on an A100.

Limitations

  • Domain and Language Skew: Primarily trained on English, IT/software-centric resumes; performance may degrade on non-technical, unusual layouts, or non-English documents.
  • Schema Lock-in: The model is tightly coupled to its specific 23-field schema and enum vocabularies, which may not align with different downstream requirements.
  • JSON Validity: Approximately 12% of outputs may be invalid JSON, requiring defensive parsing in downstream applications.