RUCKBReasoning/TableLLM-7b

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Feb 6, 2024License:llama2Architecture:Transformer0.0K Open Weights Cold

TableLLM-7b by RUCKBReasoning is a 7 billion parameter large language model fine-tuned from CodeLlama-7b-Instruct-hf, specifically designed for tabular data manipulation tasks. It generates either Python code solutions for spreadsheet-embedded data (insert, delete, update, query, merge, plot) or direct text answers for document-embedded short tables. This model excels in real office usage scenarios, demonstrating strong performance across various tabular data benchmarks.

Loading preview...

TableLLM-7b: Tabular Data Manipulation for Office Scenarios

TableLLM-7b, developed by RUCKBReasoning, is a 7 billion parameter language model fine-tuned from CodeLlama-7b-Instruct-hf. Its primary purpose is to efficiently handle tabular data manipulation tasks found in real office environments, whether the data is embedded in spreadsheets or documents. The model is capable of generating two types of outputs based on the scenario:

Key Capabilities

  • Code Generation: For spreadsheet-embedded tabular data, TableLLM-7b generates Python code solutions to perform operations such as inserting, deleting, updating, querying, merging, and plotting tables.
  • Text Generation: For document-embedded short tables, it provides direct text answers to queries.

Performance Highlights

TableLLM-7b has been evaluated on a range of benchmarks for both code and text generation. It achieves notable scores, including 86.6 on WikiSQL, 82.6 on Spider, and 78.8 on a self-created table operation benchmark for code solution generation. For text answer generation, it scores 58.8 on WikiTQ, 66.9 on TAT-QA, 72.6 on FeTaQA, and 63.1 on OTTQA. Overall, TableLLM-7b demonstrates strong performance, often outperforming other specialized models and competitive with larger general-purpose LLMs like GPT-3.5 in specific tabular tasks.

Prompting

The model utilizes distinct prompt templates for code and text generation. Code solution prompts include CSV data headers and a question, while text answer prompts provide table text, the table in CSV format, and the question to be answered.