FreedomIntelligence/openPangu-Embedded-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:32kPublished:Aug 21, 2025License:openpangu-model-license-agreement-version-1.0Architecture:Transformer0.0K Open Weights Cold

The FreedomIntelligence openPangu-Embedded-7B is a 7 billion parameter large language model, developed by FreedomIntelligence, trained from scratch on approximately 19T tokens using Ascend NPUs. It features a dense architecture with GQA attention and a 32k native context length. This model is designed to integrate both fast and slow thinking, excelling in general reasoning, mathematics, and coding benchmarks.

Loading preview...

openPangu-Embedded-7B: An Efficient 7B LLM with Fast and Slow Thinking

The openPangu-Embedded-7B is a 7 billion parameter large language model developed by FreedomIntelligence, distinguished by its training from scratch on Ascend NPUs. It incorporates a dense architecture with Grouped Query Attention (GQA) and a substantial native context length of 32,768 tokens, having been pre-trained on approximately 19 trillion tokens.

Key Capabilities & Features

  • Dual Thinking Modes: The model is designed to integrate both "fast thinking" and "slow thinking" modes. Users can switch to fast thinking by appending /no_think to their input, which results in an empty thinking_content output.
  • Robust Performance: Achieves strong results across various benchmarks, including:
    • General Reasoning: MMLU-Pro (76.32), CMMLU (75.59), C-Eval (83.05), GPQA-Diamond (70.54).
    • Mathematics: MATH-500 (95.00), AIME24 (71.57), AIME25 (58.24).
    • Coding: LiveCodeBench (54.04), MBPP+ (76.06).
  • Efficient Architecture: Utilizes a 34-layer dense architecture with 12800 hidden dimensions and a 153k vocabulary size.

When to Use This Model

This model is particularly well-suited for applications requiring a balance of general intelligence, mathematical problem-solving, and coding capabilities within a 7B parameter budget. Its unique fast/slow thinking mechanism could be beneficial for tasks where both quick responses and more deliberate, reasoned outputs are desired. The model's training on Ascend NPUs suggests potential optimization for environments leveraging Huawei's AI hardware.