tablegpt/TableGPT2-7B

Parameters: 7.6B
Precision: FP8
Context length: 131,072 tokens
Released: Nov 1, 2024
License: apache-2.0
Overview

TableGPT2-7B: Specialized for Tabular Data

TableGPT2-7B is a 7.6-billion-parameter large language model developed at Zhejiang University, designed to bridge the gap between general LLM capabilities and the specific demands of tabular and structured-data tasks. Built on the Qwen2.5 architecture, it adds specialized handling for tabular data, including a dedicated semantic encoder that interprets rows, columns, and entire tables.

Key Capabilities

  • Tabular Data Processing: Accepts tabular data serialized in the style of pandas df.head() output, alongside free-text instructions.
  • Optimized Output: Generates text-based outputs tailored to coding tasks, data interpretation, and BI-focused question answering.
  • Enhanced Performance: Reports average improvements of 35.20% over comparable models on standard benchmarks and 49.32% on BI-focused benchmarks, covering tabular comprehension, code generation, and structured-data reasoning.
  • Multilingual Support: Training primarily emphasizes Chinese corpora, with limited support for other languages.
  • Agent Integration: The companion tablegpt-agent GitHub repository is recommended for complex usage scenarios and best performance.
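Since the model consumes tables serialized as df.head() output, a prompt can be assembled directly from a pandas DataFrame. The sketch below is a minimal illustration, not the canonical template: the exact prompt wording and chat format are assumptions, and the tablegpt-agent repository should be consulted for the official usage pattern. The table and question are hypothetical.

```python
import pandas as pd

# Hypothetical example table; TableGPT2 expects the table serialized
# the way pandas prints df.head().
df = pd.DataFrame({
    "region": ["North", "South", "East"],
    "revenue": [1200, 950, 1430],
})

question = "Which region has the highest revenue?"

# Assumed prompt wording -- see the tablegpt-agent repo for the
# canonical template.
prompt = (
    "Given access to a pandas DataFrame `df`:\n"
    f"{df.head().to_string(index=False)}\n\n"
    f"Question: {question}\n"
    "Write Python code to answer the question."
)

# The prompt would then go through the model's chat template, e.g.:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("tablegpt/TableGPT2-7B")
# model = AutoModelForCausalLM.from_pretrained("tablegpt/TableGPT2-7B")
# messages = [{"role": "user", "content": prompt}]
# inputs = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, return_tensors="pt")

print(prompt)
```

The generation calls are left commented out so the prompt-building step can be inspected without downloading the 7.6B checkpoint.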

Good For

  • Business intelligence (BI) applications.
  • Automated data-driven analysis.
  • Tasks involving databases or data warehouses.
  • Natural language to SQL conversion.
  • Table understanding and question answering over structured data.
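For the natural-language-to-SQL use case, model output is typically executed against the target database. The snippet below is a hedged sketch of that last step using an in-memory SQLite table: the schema, data, and the "generated" query are all hypothetical stand-ins for what TableGPT2-7B might produce.

```python
import sqlite3

# Hypothetical table mirroring a BI scenario.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 1200), ("South", 950), ("East", 1430)],
)

# SQL a model might generate for:
# "Which region has the highest revenue?"
generated_sql = "SELECT region FROM sales ORDER BY revenue DESC LIMIT 1"
row = conn.execute(generated_sql).fetchone()
print(row[0])  # → East
```

Executing generated SQL in a sandboxed or read-only connection like this is a common safeguard before running model output against production data.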