Overview
VisCoder2-7B: Multi-Language Visualization Coding Model
VisCoder2-7B, developed by TIGER-Lab, is a specialized model for generating executable visualization code across multiple programming languages. Built upon the Qwen2.5-Coder-7B-Instruct base model, it focuses on creating code that not only executes successfully but also produces visually accurate and semantically consistent outputs from natural language instructions.
Key Capabilities & Features
- Multi-Language Support: Trained on the extensive VisCode-Multi-679K dataset, which includes 679,000 instruction-tuning examples across 12 programming languages for visualization tasks.
- Executable Code Generation: Designed to generate code that is directly executable, addressing a core challenge in visualization.
- Rendering & Self-Debugging: Incorporates capabilities for rendering visualization code and performing iterative self-debugging to refine outputs.
- Performance on VisPlotBench: Evaluated on VisPlotBench, a benchmark of 888 executable visualization tasks across 8 languages, demonstrating consistent performance and notable improvements with multi-round self-debugging.
Training Details
- Base Model: Qwen2.5-Coder-7B-Instruct.
- Tuning Method: Full-parameter supervised fine-tuning (SFT).
- Dataset: VisCode-Multi-679K, specifically curated for multi-language executable visualization.
Ideal Use Cases
- Automated Visualization: Generating visualization code from natural language descriptions.
- Multi-Language Development: Projects requiring visualization code generation in various programming environments.
- Iterative Code Refinement: Scenarios benefiting from self-debugging capabilities to improve visualization outputs.