NICKO/phi-4-BonfyreFPQ3

Text Generation · Concurrency Cost: 1 · Model Size: 14.7B · Quant: FP8 · Ctx Length: 32k · Published: Apr 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

NICKO/phi-4-BonfyreFPQ3 is a 14.7-billion-parameter language model based on the phi-4 architecture, developed by NICKO. It is distinguished by its BonfyreFPQ compression method, which uses a novel weight algebra and multi-scale encoding to achieve approximately 4 bits per weight while maintaining a per-weight cosine similarity of ~0.9999. The model is provided in BF16 safetensors format, allowing direct loading as a drop-in replacement in standard PyTorch or Hugging Face environments. Its primary use case is to provide a highly compressed yet performant language model for efficient deployment.
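Because the checkpoint ships as plain BF16 safetensors, loading should need nothing beyond the standard Hugging Face Transformers API. Below is a minimal loading sketch, assuming the repository id resolves to a standard phi-4-style causal LM layout (and that `accelerate` is installed for `device_map="auto"`); this is an illustration, not a recipe verified by the author.

```python
# Minimal loading sketch: the card claims a drop-in BF16 checkpoint,
# so no custom loader or quantization config should be required.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NICKO/phi-4-BonfyreFPQ3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",           # place layers on available devices
)
```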


NICKO/phi-4-BonfyreFPQ3: Highly Compressed Language Model

NICKO/phi-4-BonfyreFPQ3 is a 14.7-billion-parameter model using the phi-4 architecture, developed by NICKO. Its core innovation is the BonfyreFPQ v9/v10 compression method, which employs a unique Bonfyre Weight Algebra to significantly reduce model size while preserving quality.

Key Compression Details

  • Decomposition: Weights are decomposed via truncated SVD into a low-rank part and a residual (W = L + R); see the sketch after this list.
  • Pruning: The residual component R undergoes hybrid structure-aware pruning.
  • Correction: Curl and divergence energy corrections are applied.
  • Encoding: FPQ v9 multi-scale encoding (LR + E8 + RVQ + QJL + Ghost) is used to achieve high compression.
  • Output Format: The model is provided in BF16 safetensors format, designed for direct loading without special loaders, making it a drop-in replacement for standard models.
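The Bonfyre Weight Algebra itself is not public, so the sketch below only illustrates the generic decompose-and-prune idea behind the first two bullets: plain truncated SVD for the low-rank part L, with simple magnitude pruning standing in for the hybrid structure-aware pruning of R. The curl/divergence corrections and the FPQ v9 encoding stages are not reproduced here.

```python
# Generic decompose-and-prune sketch, NOT the proprietary BonfyreFPQ
# pipeline: truncated SVD yields a low-rank part L, and the residual
# R = W - L is sparsified by simple magnitude pruning.
import torch

def decompose_and_prune(W: torch.Tensor, rank: int, keep: float):
    # Truncated SVD: keep the top-`rank` singular directions as L.
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    L = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]
    # The residual carries whatever the low-rank part misses.
    R = W - L
    # Keep only the largest-magnitude `keep` fraction of residual entries
    # (a stand-in for Bonfyre's structure-aware pruning).
    threshold = torch.quantile(R.abs().flatten(), 1.0 - keep)
    R_pruned = torch.where(R.abs() >= threshold, R, torch.zeros_like(R))
    return L, R_pruned

W = torch.randn(512, 512)
L, R = decompose_and_prune(W, rank=64, keep=0.05)
print(f"relative reconstruction error: {(W - (L + R)).norm() / W.norm():.4f}")
```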

Quality & Performance

  • The compression method achieves approximately 4 bits per weight.
  • It maintains a per-weight cosine similarity of ~0.9999 with the original weights, indicating minimal quality degradation despite the significant compression (see the check sketched after this list).
  • Verified benchmarks are available for review.
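The ~0.9999 figure is a per-tensor comparison between original and reconstructed weights, and the check itself is easy to reproduce. A small sketch, using synthetic stand-in tensors rather than actual model weights:

```python
# Sketch of a per-tensor cosine-similarity check; the tensors here are
# illustrative stand-ins, not weights from the actual checkpoint.
import torch
import torch.nn.functional as F

def tensor_cosine(original: torch.Tensor, reconstructed: torch.Tensor) -> float:
    # Flatten both tensors and compare them as single vectors.
    return F.cosine_similarity(
        original.flatten().unsqueeze(0),
        reconstructed.flatten().unsqueeze(0),
    ).item()

W = torch.randn(1024, 1024)
W_hat = W + 0.014 * torch.randn_like(W)  # stand-in for a lossy reconstruction
print(f"cosine similarity: {tensor_cosine(W, W_hat):.4f}")  # ~0.9999
```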

When to Use This Model

This model is particularly suitable for applications requiring:

  • Efficient deployment: Its high compression ratio allows for reduced memory footprint and faster loading.
  • Resource-constrained environments: Ideal for scenarios where computational or storage resources are limited.
  • Standard integration: Its BF16 safetensors format ensures compatibility with existing PyTorch and Hugging Face Transformers workflows.
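Inference then follows the usual Transformers generation flow. A brief, hypothetical continuation, assuming the `model` and `tokenizer` from the loading sketch above are in scope:

```python
# Standard Transformers generation; nothing model-specific is assumed
# beyond the loading sketch earlier in this card.
prompt = "Explain truncated SVD in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```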