infly/INFLogic-Qwen2.5-32B-RL-Preview

Text generation · 32.8B parameters · FP8 quantization · 32k context length · Published: Mar 27, 2025 · License: apache-2.0 · Architecture: Transformer

INFLogic-Qwen2.5-32B-RL-Preview is a 32.8 billion parameter language model developed by infly, fine-tuned from DeepSeek-R1-Distill-Qwen-32B. It specializes in logical reasoning tasks, achieving state-of-the-art performance among open-source LLMs on ZebraLogicBench. The model was trained with reinforcement learning with verifiable rewards (RLVR) on a proprietary logical reasoning dataset to strengthen its problem-solving capabilities, making it well suited to complex analytical applications.


Model Overview

INFLogic-Qwen2.5-32B-RL-Preview is a 32.8 billion parameter language model developed by infly, specifically designed to enhance logical reasoning. It is fine-tuned from the DeepSeek-R1-Distill-Qwen-32B base model using a proprietary logical reasoning dataset and a reinforcement learning with verifiable rewards (RLVR) approach.

Key Capabilities & Performance

  • Enhanced Logical Reasoning: Achieves state-of-the-art performance among open-source LLMs on the ZebraLogicBench as of March 27th, 2025.
  • Benchmark Scores:
    • ZebraLogic: 85.1 (significantly outperforming its base model DeepSeek-R1-Distill-Qwen-32B at 68.7).
    • MATH-500: 95.6
    • GPQA: 65.7
  • Fine-tuning Method: Utilizes reinforcement learning with verifiable rewards (RLVR) to improve reasoning abilities.
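The core idea behind RLVR is that, for tasks like logic puzzles or math, a completion can be checked mechanically against a known answer and rewarded with a simple binary signal, rather than scored by a learned reward model. As a minimal sketch (the answer-extraction convention and function names here are illustrative assumptions, not infly's published pipeline), such a verifiable reward might look like:

```python
import re


def extract_final_answer(completion: str) -> str:
    """Pull the final answer from a completion.

    Assumed convention: prefer a \\boxed{...} answer if present,
    otherwise fall back to the last non-empty line.
    """
    m = re.search(r"\\boxed\{([^}]*)\}", completion)
    if m:
        return m.group(1).strip()
    lines = [ln.strip() for ln in completion.strip().splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def verifiable_reward(completion: str, gold_answer: str) -> float:
    """Binary RLVR-style reward: 1.0 on exact match with the gold answer, else 0.0."""
    return 1.0 if extract_final_answer(completion) == gold_answer.strip() else 0.0
```

Because the reward is computed by a deterministic checker rather than a preference model, it cannot be "gamed" by fluent-but-wrong reasoning, which is why RLVR is a natural fit for logic benchmarks like ZebraLogic.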

Good For

  • Applications requiring strong logical deduction and problem-solving.
  • Tasks involving complex analytical reasoning, such as those found in mathematical or puzzle-based challenges.
  • Researchers and developers focused on advancing AI capabilities in logical inference.