gdinexus/Nexus-Lumina-3B-v3

Text generation · Concurrency cost: 1 · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Apr 17, 2026 · License: qwen-research · Architecture: Transformer

gdinexus/Nexus-Lumina-3B-v3 is a 3.1 billion parameter local reasoning engine developed by GDI Nexus, fine-tuned from the Qwen2.5-3B base model. It is specifically engineered for Apple Silicon using the MLX framework, featuring an asynchronous self-healing KV cache for memory stability. This model excels at reasoning tasks, achieving 56.83% on the ARC-Challenge benchmark, making it suitable for enterprise-grade local reasoning on fanless hardware.


Overview

Nexus-Lumina-3B-v3 is a 3.1 billion parameter model developed by GDI Nexus, designed as a highly optimized local reasoning engine. It is fine-tuned from the Qwen2.5-3B base model, specifically targeting reasoning capabilities while mitigating instruction-tuning degradation.

Key Capabilities

  • Optimized for Reasoning: Fine-tuned on 17,636 filtered reasoning chains from the OpenHermes dataset to enhance logical processing.
  • Apple Silicon Integration: Engineered for optimal performance on Apple Silicon hardware, leveraging the MLX framework.
  • Memory Stability: Incorporates a proprietary asynchronous self-healing KV cache (mlx-ash-kv) designed to keep cache state consistent at extended context lengths.
  • Local Evaluation: Achieved a normalized accuracy of 56.83% on the ARC-Challenge (25-shot) benchmark, evaluated entirely locally on unified memory.
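
The mlx-ash-kv component is proprietary and not documented on this card, so the following is only a conceptual sketch of what "self-healing" cache behavior could mean: each cached entry carries a checksum, and an entry that fails verification on read is recomputed from its source position instead of propagating corrupted state. All names here (`SelfHealingKVCache`, `recompute_fn`) are illustrative, not part of the actual library.

```python
import hashlib

import numpy as np


class SelfHealingKVCache:
    """Conceptual sketch of a checksum-guarded KV cache.

    Entries are validated on read; a corrupted entry is rebuilt via a
    caller-supplied recompute function rather than returned as-is.
    """

    def __init__(self, recompute_fn):
        self._recompute = recompute_fn  # position -> (key, value) arrays
        self._entries = {}              # position -> (key, value, checksum)

    @staticmethod
    def _checksum(key, value):
        h = hashlib.blake2b(digest_size=8)
        h.update(key.tobytes())
        h.update(value.tobytes())
        return h.digest()

    def put(self, pos, key, value):
        self._entries[pos] = (key, value, self._checksum(key, value))

    def get(self, pos):
        key, value, stored = self._entries[pos]
        if self._checksum(key, value) != stored:
            # "Self-heal": the entry no longer matches its checksum,
            # so rebuild it from scratch instead of using bad state.
            key, value = self._recompute(pos)
            self.put(pos, key, value)
        return key, value
```

In a real implementation the recompute step would presumably run asynchronously (hence "asynchronous" in the name) so healing does not stall decoding; this sketch keeps it synchronous for clarity.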

Good For

  • Local Reasoning Tasks: Ideal for applications requiring robust reasoning capabilities on local devices.
  • Apple Silicon Deployments: Specifically designed to maximize performance and efficiency on Apple Silicon-powered hardware.
  • Enterprise-Grade Local AI: Suitable for enterprise use cases where powerful, local reasoning is required on fanless hardware.
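
Since the card names the MLX framework, the model can presumably be run locally with the mlx-lm toolkit. The invocation below is a sketch based on mlx-lm's standard CLI; the card itself does not specify a launch command, so treat the package name and flags as assumptions to verify against the mlx-lm documentation.

```shell
# Assumption: the model loads through the standard mlx-lm tooling.
pip install mlx-lm

# Download the weights and run a short generation on-device.
mlx_lm.generate \
  --model gdinexus/Nexus-Lumina-3B-v3 \
  --prompt "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?" \
  --max-tokens 256
```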