Malum0x/mlp-surgery-restored-top30

Text Generation | Concurrency Cost: 1 | Model Size: 3.1B | Quant: BF16 | Ctx Length: 32k | Published: May 7, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

Malum0x/mlp-surgery-restored-top30 is a 3.1 billion parameter model based on Qwen2.5-3B-Instruct that has undergone 'MLP surgery' to restore reasoning capabilities. Developed by Malum0x, it was created by copying the base model's weights for the 30 most-damaged MLP layers back into a fine-tuned version whose reasoning had degraded. It scores 64.29% on GSM8K and 48.55% on ARC Challenge, slightly above the base model on both benchmarks. The model illustrates a method for recovering lost capabilities in fine-tuned LLMs without retraining.


Overview

Malum0x/mlp-surgery-restored-top30 is a 3.1 billion parameter model that demonstrates an 'MLP surgery' technique for restoring reasoning ability. It starts from a Qwen2.5-3B-Instruct model fine-tuned on perplexity-filtered OpenHermes 2.5, a fine-tune that inadvertently reduced reasoning performance, and repairs it with weights taken from the original Qwen2.5-3B-Instruct base.

Key Capabilities & Method

This model's primary innovation lies in its restoration method:

  • MLP Layer Restoration: The 30 most-damaged MLP layers in the fine-tuned model are identified and replaced with the corresponding layers from the original base model.
  • No Retraining: The restoration is achieved purely through weight surgery, without any additional retraining.
  • Improved Reasoning: On the reported benchmarks (GSM8K and ARC Challenge), the restored model regains the lost performance and scores slightly above the base model.
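
The restoration step can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions not stated in the model card: "damage" is measured as the relative L2 drift of each MLP weight from its base value (the card does not say which metric was used), parameter names follow the `model.layers.{i}.mlp.*` pattern used by Hugging Face Qwen2 checkpoints, and plain Python lists stand in for weight tensors. The function names `relative_drift`, `rank_damaged_mlp_layers`, and `restore_mlp_layers` are hypothetical.

```python
import math
import re
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flattened weights.
StateDict = Dict[str, List[float]]

# Assumed naming scheme for MLP parameters (Hugging Face Qwen2 style).
MLP_PATTERN = re.compile(r"^model\.layers\.(\d+)\.mlp\.")


def relative_drift(base: List[float], tuned: List[float]) -> float:
    """Assumed damage metric: ||tuned - base|| / ||base||."""
    diff = math.sqrt(sum((b - t) ** 2 for b, t in zip(base, tuned)))
    norm = math.sqrt(sum(b * b for b in base))
    return diff / norm if norm > 0 else 0.0


def rank_damaged_mlp_layers(base_sd: StateDict, tuned_sd: StateDict) -> List[int]:
    """Return layer indices sorted from most to least drifted MLP."""
    per_layer: Dict[int, List[float]] = {}
    for name, base_w in base_sd.items():
        match = MLP_PATTERN.match(name)
        if match:
            idx = int(match.group(1))
            per_layer.setdefault(idx, []).append(relative_drift(base_w, tuned_sd[name]))
    # Average the drift over the MLP parameters of each layer.
    scores = {idx: sum(vals) / len(vals) for idx, vals in per_layer.items()}
    return sorted(scores, key=scores.get, reverse=True)


def restore_mlp_layers(base_sd: StateDict, tuned_sd: StateDict, top_k: int) -> StateDict:
    """Copy base MLP weights into the top_k most-drifted layers; keep the rest."""
    chosen = set(rank_damaged_mlp_layers(base_sd, tuned_sd)[:top_k])
    restored: StateDict = {}
    for name in tuned_sd:
        match = MLP_PATTERN.match(name)
        use_base = match is not None and int(match.group(1)) in chosen
        restored[name] = list((base_sd if use_base else tuned_sd)[name])
    return restored
```

With real checkpoints the same logic would run over tensors from the two models' state dicts, with `top_k=30` for this model. Attention and embedding weights are left at their fine-tuned values because only names matching the MLP pattern are ever swapped.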

Performance Highlights

Evaluated with lm-eval (GSM8K: flexible-extract, 5-shot; ARC Challenge: acc_norm, 0-shot), the restored model scores slightly above the base model on both benchmarks:

  • GSM8K: Achieves 64.29%, outperforming the base model (63.15%) and the 'broken' fine-tuned model (61.64%).
  • ARC Challenge: Reaches 48.55%, fully recovering and slightly exceeding the base model's score (48.12%).
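
To put these scores in perspective, the gaps can be expressed in percentage points and compared with a rough sampling error. The sketch below uses only the scores listed above; the test-set sizes (1,319 problems for GSM8K, 1,172 questions for ARC Challenge) and the simple binomial standard-error formula are assumptions added here for illustration, not figures reported by the model card.

```python
import math

# Scores as reported above (percent accuracy).
SCORES = {
    "gsm8k": {"restored": 64.29, "base": 63.15, "fine_tuned": 61.64},
    "arc_challenge": {"restored": 48.55, "base": 48.12},
}

# Assumption: standard test-split sizes for each benchmark.
TEST_SET_SIZES = {"gsm8k": 1319, "arc_challenge": 1172}


def delta_pp(a: float, b: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return round(a - b, 2)


def binomial_stderr_pp(acc_pct: float, n: int) -> float:
    """Approximate standard error of an accuracy, in percentage points."""
    p = acc_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


for task, s in SCORES.items():
    gain_vs_base = delta_pp(s["restored"], s["base"])
    stderr = binomial_stderr_pp(s["restored"], TEST_SET_SIZES[task])
    print(f"{task}: +{gain_vs_base} pp vs base, stderr ~{stderr:.2f} pp")
    if "fine_tuned" in s:
        gain_vs_ft = delta_pp(s["restored"], s["fine_tuned"])
        print(f"{task}: +{gain_vs_ft} pp vs fine-tuned")
```

Under these assumptions the gains over the base model are on the order of one standard error or less, so they are better read as recovery to roughly base-level performance than as a clear improvement over it.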

Use Cases

This model is particularly relevant for:

  • Researchers exploring methods for model repair and capability restoration.
  • Developers interested in efficient ways to mitigate performance degradation after fine-tuning without extensive retraining.
  • Applications that need reasoning performance on par with Qwen2.5-3B-Instruct from a fine-tuned 3.1B parameter model.