continuum-ai/qwen2.5-3b-general-forged

Text generation

  • Concurrency cost: 1
  • Model size: 3.1B
  • Quantization: BF16
  • Context length: 32k
  • Published: Mar 27, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open
  • Status: Cold

The continuum-ai/qwen2.5-3b-general-forged model is a Qwen2.5-3B variant developed by continuum-ai, optimized through a pruning-and-retraining process. It achieves a 0.4% perplexity reduction over its base model while being 30% smaller thanks to attention-head pruning. Designed for general language tasks and verifiable via the ForgeAlloy chain of custody, it is well suited to resource-constrained environments such as a MacBook Pro or mobile devices.
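A minimal loading sketch, assuming the weights are published on the Hugging Face Hub under this repo id and load through the standard transformers API; the function name here is illustrative, not part of the model card:

```python
def load_forged_model(repo_id: str = "continuum-ai/qwen2.5-3b-general-forged"):
    """Load the forged model and its tokenizer in BF16 (the published precision)."""
    # Lazy imports so the sketch can be read without transformers/torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
        device_map="auto",           # requires the accelerate package
    )
    return tokenizer, model
```

On a 16GB machine the BF16 weights fit, but leave headroom for the KV cache at long context lengths.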


Model Overview

The continuum-ai/qwen2.5-3b-general-forged model is a specialized version of the Qwen2.5-3B architecture, developed by continuum-ai. This model has undergone a unique optimization process involving 30% head pruning and subsequent retraining using Experiential Plasticity, resulting in a more compact yet efficient model.

Key Characteristics

  • Efficiency: Achieves a 30% reduction in parameters through head pruning while maintaining or improving performance.
  • Performance: Demonstrates a perplexity of 2.29, a 0.4% reduction from the base Qwen2.5-3B model's 2.30.
  • Provenance: Features cryptographic provenance via the ForgeAlloy chain of custody, ensuring verifiable claims and model integrity.
  • Methodology: Developed through a prune → train pipeline over 3 cycles, detailed in the methodology paper.
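The headline numbers above can be sanity-checked with a few lines of arithmetic. This sketch also estimates the per-cycle pruning fraction under an assumption the card does not state, namely that each of the 3 cycles prunes the same fraction of the remaining heads:

```python
# Perplexity change: 2.30 (base) -> 2.29 (forged), lower is better.
base_ppl, forged_ppl = 2.30, 2.29
delta_pct = (forged_ppl - base_ppl) / base_ppl * 100
print(f"Perplexity change: {delta_pct:.2f}%")  # about -0.43%, i.e. ~0.4% better

# 30% of heads pruned in total; if the 3 cycles prune equal fractions
# of what remains, each cycle removes 1 - 0.70^(1/3) of its heads.
total_kept = 0.70
per_cycle = 1 - total_kept ** (1 / 3)
print(f"Per-cycle prune fraction: {per_cycle:.2%}")  # about 11.2% per cycle
```

The equal-fraction schedule is illustrative only; the actual schedule is described in the methodology paper.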

Use Cases and Compatibility

This model is particularly well-suited for general language tasks where computational resources are a concern. Its optimized size allows it to run efficiently on a variety of devices, including:

  • MacBook Pro (16GB and 32GB RAM)
  • MacBook Air (8GB and 16GB RAM)
  • Mobile devices (iPhone / Android) using quantized formats like Q4_K_M.

Its verifiable chain of custody makes it a strong candidate for applications requiring high trust and transparency in model development.