Aesdi90/DeepSeek-32B-Bare-Mind

TEXT GENERATION

Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 3, 2026 · License: MIT · Architecture: Transformer · Open Weights

Aesdi90/DeepSeek-32B-Bare-Mind is a 32.8 billion parameter distilled reasoning model developed by DeepSeek AI, based on the Qwen2.5-32B architecture. It is part of the DeepSeek-R1-Distill series, which leverages reasoning patterns from larger DeepSeek-R1 models. This model is specifically fine-tuned to excel in complex reasoning tasks across mathematics, code, and general problem-solving, demonstrating strong performance on benchmarks like AIME 2024 and MATH-500.


DeepSeek-R1-Distill-Qwen-32B Overview

This model, Aesdi90/DeepSeek-32B-Bare-Mind, is a 32.8 billion parameter model from the DeepSeek-R1-Distill series, developed by DeepSeek AI. It is a distilled version of the larger DeepSeek-R1 models, which were trained using large-scale reinforcement learning (RL) to develop advanced reasoning capabilities without initial supervised fine-tuning (SFT).
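In practice, R1-distill models are run with specific sampling settings. The sketch below encodes the configuration commonly recommended upstream for the DeepSeek-R1 series (temperature around 0.6, `top_p` 0.95, all instructions in the user turn with no system prompt); these values come from the upstream DeepSeek-R1 usage notes, not this card, and should be verified against the repository:

```python
# Hedged sketch: sampling settings commonly recommended for DeepSeek-R1
# distills. Verify against the model repository before relying on them.
SAMPLING = {
    "temperature": 0.6,      # 0.5-0.7 recommended; lower values can loop
    "top_p": 0.95,
    "max_new_tokens": 4096,  # leave headroom for long chain-of-thought
}

def build_messages(question: str) -> list[dict]:
    """Build a chat-message list for the model.

    R1 distills are typically run without a system prompt, so all
    instructions go directly into the user turn.
    """
    return [{"role": "user", "content": question}]
```

These settings plug into any standard chat-completion interface; the dictionary keys mirror common generation parameters rather than a specific library's API.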

Key Capabilities & Differentiators

  • Reasoning Performance: Leverages reasoning patterns discovered by the 671B parameter DeepSeek-R1 model, achieving strong performance in math, code, and general reasoning tasks.
  • Distillation Approach: Demonstrates that complex reasoning abilities from larger models can be effectively transferred to smaller, dense models, making high-quality reasoning more accessible.
  • Benchmark Achievements: Shows competitive results against larger models on benchmarks such as AIME 2024 (72.6% pass@1), MATH-500 (94.3% pass@1), and LiveCodeBench (57.2% pass@1).
  • Optimized for CoT: Designed to engage in thorough chain-of-thought (CoT) reasoning, with recommendations to enforce a <think> prefix for optimal performance.
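When serving the model, downstream code usually needs to separate the chain-of-thought from the final answer. A minimal sketch, assuming the `<think>…</think>` delimiting convention described above (the helper name is hypothetical):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (chain_of_thought, final_answer).

    Assumes the model wraps its reasoning in <think>...</think>, as
    recommended for the R1-distill series; if no such block is found,
    the whole text is treated as the answer.
    """
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", text, re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return "", text.strip()
```

For example, `split_reasoning("<think>2+2=4</think>The answer is 4.")` returns `("2+2=4", "The answer is 4.")`, letting an application log the reasoning while showing only the answer.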

When to Use This Model

  • Complex Reasoning: Ideal for applications requiring robust logical deduction, mathematical problem-solving, and code generation.
  • Resource-Efficient Reasoning: Suitable for scenarios where the reasoning power of larger models is needed, but with the computational efficiency of a 32.8B parameter model.
  • Benchmarking & Research: Recommended for researchers and developers looking to explore advanced reasoning capabilities in a distilled model, especially when comparing against other LLMs in math, code, and general intelligence tasks.
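For benchmarking work like the above, pass@1 scores of this kind are typically computed with the unbiased pass@k estimator from the HumanEval evaluation methodology (given n generations per problem, c of them correct). A small sketch, with an illustrative function name:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples drawn (without replacement) from n generations, c of
    which are correct, passes.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: every draw of k passes
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

Averaging this quantity over all problems in a benchmark gives the reported score; with n = k = 1 it reduces to plain accuracy.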