TIGER-Lab/VisCoder2-7B

Warm
Public
7.6B
FP8
131072
License: apache-2.0
Hugging Face
Overview

VisCoder2-7B: Multi-Language Visualization Coding Model

VisCoder2-7B, developed by TIGER-Lab, is a specialized model for generating executable visualization code across multiple programming languages. Built upon the Qwen2.5-Coder-7B-Instruct base model, it focuses on creating code that not only executes successfully but also produces visually accurate and semantically consistent outputs from natural language instructions.

Key Capabilities & Features

  • Multi-Language Support: Trained on the extensive VisCode-Multi-679K dataset, which includes 679,000 instruction-tuning examples across 12 programming languages for visualization tasks.
  • Executable Code Generation: Designed to generate code that is directly executable, addressing a core challenge in visualization.
  • Rendering & Self-Debugging: Incorporates capabilities for rendering visualization code and performing iterative self-debugging to refine outputs.
  • Performance on VisPlotBench: Evaluated on VisPlotBench, a benchmark of 888 executable visualization tasks across 8 languages, demonstrating consistent performance and notable improvements with multi-round self-debugging.

Training Details

  • Base Model: Qwen2.5-Coder-7B-Instruct.
  • Tuning Method: Full-parameter supervised fine-tuning (SFT).
  • Dataset: VisCode-Multi-679K, specifically curated for multi-language executable visualization.

Ideal Use Cases

  • Automated Visualization: Generating visualization code from natural language descriptions.
  • Multi-Language Development: Projects requiring visualization code generation in various programming environments.
  • Iterative Code Refinement: Scenarios benefiting from self-debugging capabilities to improve visualization outputs.