DangIT02/qwen3vl-flowchart-to-mermaid_v3
DangIT02/qwen3vl-flowchart-to-mermaid_v3 is an 8 billion parameter vision-language model, fine-tuned from unsloth/Qwen3-VL-8B-Instruct, specifically designed to transcribe flowchart images into Mermaid code. This v3 iteration significantly improves transcription fidelity by fine-tuning the vision tower, achieving an 82.5% node_f1 score. It excels at converting visual flowcharts into structured, reproducible Mermaid syntax, making it ideal for automating diagram documentation and generation.
Loading preview...
Overview
DangIT02/qwen3vl-flowchart-to-mermaid_v3 is an 8 billion parameter vision-language model, built upon unsloth/Qwen3-VL-8B-Instruct, specialized in converting flowchart images into Mermaid code. This model, developed by DangIT02, focuses on accurately reproducing diagram nodes, edges, labels, and direction in Mermaid syntax.
Key Capabilities & Improvements
- Flowchart-to-Mermaid Transcription: Converts visual flowchart diagrams into valid Mermaid code.
- Enhanced Fidelity (v3): Unlike its predecessor, v3 fine-tunes the vision tower, leading to substantial improvements in transcription accuracy, particularly for smaller diagrams.
- Canonicalized Output: Generates Mermaid code with canonicalized node IDs (A, B, C, etc.) for deterministic output and tool-use compatibility, while preserving original node and edge labels.
- Performance: Achieves an overall
node_f1of 0.825, with significant gains on small diagrams (node_f1 of 0.872, a +0.366 improvement over v2). - High Parse Success: Boasts a 100% parse success rate for generated Mermaid code.
Limitations
- Direction Detection: Accuracy for diagram direction (
graph TD,BT,LR) degrades on very large flowcharts (20+ nodes). - Hallucination: May occasionally generate plausible-but-incorrect structures for extremely complex diagrams (25+ nodes, dense text).
- English-only Labels: Performance with non-English labels is untested.
- Max Output: Output is limited to approximately 2048 tokens, which may truncate very large flowcharts.