Name: AEON-7/DFlash-Qwen3.5-27B-Uncensored API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: AEON-7

DFlash Qwen3.5-27B Uncensored Overview

This model is a 27 billion parameter hybrid linear-attention model built on the Qwen3.5 architecture, released by AEON-7. It operates in BF16 full-precision and supports both vision and text modalities. A key differentiator is its integration of DFlash speculative decoding, which significantly boosts inference speed, particularly on memory-bandwidth-limited hardware like DGX Spark.

Key Capabilities & Features

Enhanced Inference Speed: DFlash speculative decoding achieves up to 2.7x speedup (33.2 tok/s single-stream) compared to the baseline 12.2 tok/s, by amortizing memory bandwidth costs across multiple tokens.
Dense Architecture Advantages: Unlike sparse Mixture-of-Experts (MoE) models, this dense 27B model ensures all parameters contribute to every token, leading to higher quality per FLOP, predictable latency, and simpler deployment.
Hybrid Attention: Features a unique architecture with 48 Gated Delta Network (GDN) layers for efficient long-context processing and 16 full-attention layers for global context capture, supporting a maximum context length of 131,072 tokens.
Multimodal (Vision + Text): Includes a 27-layer ViT vision encoder (460M parameters) for robust image understanding capabilities.
Uncensored: Created using an "abliteration" technique to remove safety alignment, offering a model with no built-in refusal behavior.

When to Use This Model

High-throughput applications: Ideal for scenarios requiring fast, responsive inference on dense models, especially on hardware where memory bandwidth is a bottleneck.
Applications needing multimodal understanding: Suitable for tasks that involve processing and generating responses based on both text and image inputs.
Research and development: Useful for exploring the capabilities of uncensored models or for applications where custom safety layers are preferred.

Overview

DFlash Qwen3.5-27B Uncensored Overview

Key Capabilities & Features

When to Use This Model

Full Model Card (README)