junaidali/qwenadapters

Text Generation · Concurrency Cost: 3 · Model Size: 35.1B · Quant: FP8 · Ctx Length: 32k · Tool Calling: Supported · Published: Jun 26, 2026 · Architecture: Transformer · Cold

The junaidali/qwenadapters model is a 35.1-billion-parameter Qwen3.5-MoE model: Qwen/Qwen-AgentWorld-35B-A3B with a v130 CPA LoRA adapter merged into its weights. The rank-16 adapter was trained with MLX-LM and covers the top 12 decoder layers. The model is intended for general causal language modeling; its Mixture-of-Experts (MoE) structure activates only a subset of the parameters for each token.

Overview

This model, junaidali/qwenadapters, is built on the Qwen/Qwen-AgentWorld-35B-A3B base model, a Qwen3.5-MoE architecture with 35 billion total parameters, of which roughly 3 billion are active per token (the "A3B" in the name). What sets it apart from the base model is a v130 CPA LoRA adapter that has been merged directly into the weights, which are stored in full bf16 precision.

Key Characteristics

  • Base Model: Qwen3.5-MoE, 35B total parameters, roughly 3B active per token (A3B).
  • Adapter Details: A rank-16 LoRA adapter (qwen-cpa-v130-pretrain-fix) trained with MLX-LM over 3300 iterations.
  • Coverage: The adapter modifies only the top 12 decoder layers (28–39) of the base model; the remaining layers are unchanged.
  • Merged Modules: In each of the 12 affected layers, the adapter deltas were folded into the full-attention projections (self_attn.{q,k,v,o}_proj), the linear-attention layers, the MoE routing, the shared expert components, and the fused routed experts.
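Folding an adapter into the weights can be illustrated with a small sketch. It assumes the standard LoRA formulation, in which a low-rank product scaled by a constant is added to each target weight matrix; the helper names, the toy shapes, and the `scale` value are hypothetical and are not taken from this model's training configuration. The sketch uses rank 1 for readability, whereas the adapter described above is rank 16.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, scale):
    """Return weight + scale * (lora_b @ lora_a) without mutating the inputs."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Layers 28-39 inclusive: the 12 decoder layers the adapter covers.
AFFECTED_LAYERS = list(range(28, 40))

# Toy example (hypothetical values): a 3x2 weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # shape (out=3, in=2)
lora_b = [[1.0], [0.0], [-1.0]]                # shape (out=3, r=1)
lora_a = [[0.5, 2.0]]                          # shape (r=1, in=2)
scale = 2.0                                    # assumed scaling constant

merged = merge_lora(weight, lora_b, lora_a, scale)

# The merged weight gives the same output as base path + adapter path.
x = [[3.0], [4.0]]  # input as a column vector
y_merged = matmul(merged, x)
base_out = matmul(weight, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
y_unmerged = [[b[0] + scale * a[0]] for b, a in zip(base_out, adapter_out)]
```

Once merged, the adapter adds no extra matrices or inference-time computation: the checkpoint has the same shape as the base model and loads like any other set of weights.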

Usage

Because the adapter is already merged, the model loads directly with the transformers library like any other causal language model checkpoint. It supports bfloat16 precision and automatic device mapping, so no separate adapter-loading step is needed.
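A minimal loading sketch, assuming a transformers release that supports the Qwen3.5-MoE architecture, an installed `accelerate` package (required for `device_map="auto"`), and enough accelerator memory for a 35B-parameter checkpoint. The helper names `load_model` and `generate` are illustrative and not part of the model card.

```python
MODEL_ID = "junaidali/qwenadapters"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and the merged model in bfloat16 with automatic device placement."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # bf16 precision, as stated above
        device_map="auto",           # let accelerate place layers on available devices
    )
    return tokenizer, model


def generate(prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (downloads the weights on first use)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a LoRA adapter is.")` would download the checkpoint and return the decoded text; for repeated calls, load the model once with `load_model()` and reuse it rather than reloading on every prompt.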