Entrit/Mistral-7B-v0.3-trit-uniform-d1
Entrit/Mistral-7B-v0.3-trit-uniform-d1 is a balanced-ternary quantized version of the 7-billion-parameter Mistral-7B-v0.3 language model. Developed by Entrit Systems, it uses a novel post-training quantization method that achieves 1.88 bits per weight. The model is optimized for efficient inference on hardware that can consume the packed trit format, offering a highly compressed representation of the original Mistral-7B-v0.3.
Overview
Entrit/Mistral-7B-v0.3-trit-uniform-d1 is a 7 billion parameter language model derived from mistralai/Mistral-7B-v0.3 through balanced ternary post-training quantization. This model, developed by Entrit Systems, implements a unique quantization scheme achieving 1.88 bits per weight at a depth of d=1 (3 levels per weight).
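To make the d=1 scheme concrete, the sketch below shows one way a weight tensor can be mapped to 3 levels ({-1, 0, +1} times a per-tensor scale) and dequantized back to FP16. The threshold and scale heuristics here are illustrative assumptions and do not reproduce the actual tritllm-codec (v2) procedure.

```python
import torch

def ternary_quantize(w: torch.Tensor, threshold_ratio: float = 0.7):
    """Illustrative d=1 (3-level) quantization of a weight tensor.

    The zero-band threshold and per-tensor scale below are assumed
    heuristics; the real tritllm-codec scheme may differ.
    """
    delta = threshold_ratio * w.abs().mean()      # zero-band threshold (assumption)
    trits = torch.zeros_like(w, dtype=torch.int8)
    trits[w > delta] = 1
    trits[w < -delta] = -1
    mask = trits != 0
    scale = w[mask].abs().mean() if mask.any() else w.new_tensor(0.0)
    return trits, scale

def dequantize_fp16(trits: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Reconstruct FP16 weights from trits for standard transformers inference."""
    return trits.to(torch.float16) * scale.to(torch.float16)
```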
Key Quantization Details
This model utilizes the tritllm-codec (v2) for quantization, as detailed in the paper "Balanced Ternary Post-Training Quantization for Large Language Models" by Stentzel (2026). Key specifications include:
- Source Model: mistralai/Mistral-7B-v0.3
- Quantization Depth: d=1 (3 levels)
- Bits per Weight: 1.88
- Method: Uniform Post-Training Quantization (PTQ)
- Quantized Layers: All 2D linear matrices are quantized.
- FP16 Kept: lm_head, token embeddings, and all *_norm layers remain in FP16 for compatibility (see the layer-selection sketch after this list).
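As a rough illustration of the layer selection above, the snippet below walks the standard transformers module layout for Mistral-7B-v0.3 and separates the 2D linear matrices from the layers the card says stay in FP16. The module-name heuristics are assumptions based on the stock Mistral architecture, not on Entrit's tooling.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3", torch_dtype=torch.float16
)

to_quantize, kept_fp16 = [], []
for name, module in model.named_modules():
    if isinstance(module, nn.Linear) and "lm_head" not in name:
        to_quantize.append(name)   # attention and MLP projection matrices
    elif isinstance(module, nn.Embedding) or name.endswith("norm"):
        kept_fp16.append(name)     # token embeddings and RMSNorm layers stay FP16

print(f"{len(to_quantize)} linear layers selected for ternary quantization")
print(f"{len(kept_fp16)} layers kept in FP16 (embeddings, norms)")
```

Note that lm_head is itself an nn.Linear, so it is excluded by name to match the FP16-kept list above.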
Unique Differentiator
While the weights are dequantized to FP16 for standard transformers compatibility (resulting in an on-disk size similar to the FP16 source), the 1.88-bpw figure reflects the information content of the quantized weights. This makes the model particularly relevant for specialized hardware that can process the packed trit format directly, enabling highly efficient inference. The model's design focuses on significant weight compression while maintaining performance, making it a strong candidate for resource-constrained environments or edge deployments when paired with compatible inference kernels.
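For kernels or hardware that consume trits directly, one simple packing layout stores 5 trits per byte (3^5 = 243 distinct values), i.e. 1.6 bits per trit. The functions below are a hypothetical sketch of such a layout; the actual packed trit format, and the accounting behind the 1.88-bpw figure, may differ.

```python
import numpy as np

def pack_trits(trits: np.ndarray) -> np.ndarray:
    """Pack balanced trits {-1, 0, +1} into bytes, 5 trits per byte (assumed layout)."""
    t = (trits.astype(np.int64) + 1).ravel()                          # shift to {0, 1, 2}
    t = np.concatenate([t, np.zeros((-len(t)) % 5, dtype=np.int64)])  # pad to multiple of 5
    powers = 3 ** np.arange(5)                                        # base-3 place values
    return (t.reshape(-1, 5) * powers).sum(axis=1).astype(np.uint8)

def unpack_trits(packed: np.ndarray, n: int) -> np.ndarray:
    """Recover the first n balanced trits in {-1, 0, +1} from packed bytes."""
    digits = packed.astype(np.int64)[:, None] // (3 ** np.arange(5)) % 3
    return (digits.ravel()[:n] - 1).astype(np.int8)
```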