HKUST-DSAIL/GraphMind-LLAMA-3.1-8B

Text Generation · Model Size: 8B · Quantization: FP8 · Context Length: 32k · Published: Aug 17, 2025 · License: MIT · Architecture: Transformer

HKUST-DSAIL/GraphMind-LLAMA-3.1-8B is an 8 billion parameter Large Language Model from the GraphMind series, developed by HKUST-DSAIL. This model is built upon the Llama 3.1 architecture and has undergone continued pre-training on GraphPile, a 10.9 billion token dataset focused on Graph Problem Reasoning. It excels in generalized reasoning, showing significant improvements across mathematical, logical, code, and commonsense reasoning benchmarks, making it ideal for complex problem-solving and algorithmic tasks.


GraphMind-LLAMA-3.1-8B: Enhanced Generalized Reasoning

GraphMind-LLAMA-3.1-8B is an 8 billion parameter Large Language Model from the GraphMind series, developed by HKUST-DSAIL. It is built upon the Llama 3.1 architecture and has been significantly enhanced through continued pre-training (CPT) on GraphPile, a specialized 10.9 billion token dataset. GraphPile is uniquely designed with Graph Problem Reasoning (GPR) data, including Chain-of-Thought (CoT), Program-of-Thought (PoT), and Trace-of-Execution (ToE) data, alongside real-world graph data.
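Because the model is a continued-pre-trained Llama 3.1 checkpoint, it should load with Hugging Face `transformers` in the usual way. The sketch below is illustrative only: the repo id comes from this card, but the prompt layout, decoding settings, and `device_map="auto"` placement are assumptions, not a recipe documented by the GraphMind authors.

```python
MODEL_ID = "HKUST-DSAIL/GraphMind-LLAMA-3.1-8B"  # repo id from this card


def build_graph_prompt(edges, question):
    """Format an undirected graph problem as a plain-text prompt.

    The prompt layout here is an illustrative assumption, not a format
    documented by the GraphMind authors.
    """
    edge_lines = "\n".join(f"{u} -- {v}" for u, v in edges)
    return (
        f"Graph edges:\n{edge_lines}\n\n"
        f"Question: {question}\n"
        "Answer step by step:"
    )


def generate(prompt, max_new_tokens=256):
    """Run generation with transformers (import deferred so the prompt
    helper above also works without the library installed)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_graph_prompt(
        [(0, 1), (1, 2), (2, 0)],
        "Does this graph contain a cycle?",
    )
    print(generate(prompt))
```

Loading the full model requires roughly 16 GB of GPU memory in bf16; a quantized variant or CPU offload may be needed on smaller hardware.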

Key Capabilities

  • Superior Generalized Reasoning: Demonstrates substantial improvements across various reasoning domains:
    • Mathematical Reasoning: up to 4.9% average improvement.
    • Logical Reasoning: 33.4% improvement.
    • Code Reasoning: 46.3% improvement.
    • Commonsense Reasoning: 7.8% improvement.
    • Multi-Hop QA: 10.3% improvement.
  • Exceptional Graph Problem Solving: Achieves an average improvement of 53.1% on graph problem reasoning tasks compared to baseline models.
  • Strong Transfer Learning: Reasoning skills acquired from graph problems effectively transfer to other domains, providing a robust foundation for fine-tuning.

Intended Use Cases

  • Complex Problem Solving: Ideal for tasks requiring sophisticated logical, mathematical, and algorithmic reasoning.
  • Algorithmic Reasoning & Code Generation: Particularly strong in graph-related algorithmic tasks.
  • Foundation for Fine-tuning: Serves as a powerful base model for further fine-tuning on reasoning-intensive downstream applications.

Limitations

  • The GraphPile dataset currently covers only 23 distinct graph problem tasks; broader task diversity could further improve generalization.
  • As a reasoning-focused model, it may underperform on simpler, non-reasoning tasks such as summarization or translation.