zjunlp/DataMind-Analysis-Qwen2.5-7B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jul 19, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The DataMind-Analysis-Qwen2.5-7B model, developed by zjunlp, is a 7.6 billion parameter language model built on the Qwen2.5 architecture with a 32768 token context length. It is specifically fine-tuned to enhance data analysis capabilities, excelling in data understanding, code generation for analytical tasks, and strategic planning. This model addresses limitations of open-source LLMs in reasoning-intensive data analysis scenarios, making it suitable for automating complex analytical workflows.

Loading preview...

DataMind-Analysis-Qwen2.5-7B: Enhanced Data Analysis LLM

The DataMind-Analysis-Qwen2.5-7B is a 7.6 billion parameter language model developed by zjunlp, specifically designed to overcome the limitations of open-source LLMs in data analysis tasks. Presented in the paper "Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study," this model focuses on improving reasoning-intensive scenarios.

Key Capabilities

  • Optimized for Data Analysis: The model is fine-tuned to enhance performance across three core dimensions of data analysis: data understanding, code generation, and strategic planning.
  • Improved Strategic Planning: Research indicates that strategic planning quality is a primary determinant of performance, and this model leverages insights to boost this capability.
  • Robust Code Generation: Excels at generating Python code for data analysis tasks, as demonstrated by its usage examples for processing CSV files with pandas.
  • Data-Driven Enhancement: Developed using a data synthesis methodology informed by findings that data quality significantly impacts analytical reasoning.

Good For

  • Automating complex data analysis workflows.
  • Generating Python code for data manipulation and insights.
  • Applications requiring strong analytical reasoning and strategic planning in data contexts.
  • Researchers and developers looking for an open-source LLM specialized in data analysis tasks.