Entrit/Mistral-7B-v0.3-trit-uniform-d4

Text Generation · Model Size: 7B · Context Length: 4k · Published: May 4, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Entrit/Mistral-7B-v0.3-trit-uniform-d4 is a 7 billion parameter language model based on Mistral-7B-v0.3, developed by Entrit Systems. It features balanced ternary post-training quantization at a depth of d=4, resulting in 6.64 bits per weight. This model is optimized for efficient inference on hardware capable of consuming packed trit formats, offering a highly compressed representation of the original Mistral-7B-v0.3.


Overview

Entrit/Mistral-7B-v0.3-trit-uniform-d4 is a 7 billion parameter language model derived from mistralai/Mistral-7B-v0.3. Developed by Entrit Systems, this model utilizes a balanced ternary post-training quantization (PTQ) method, as detailed in the paper "Balanced Ternary Post-Training Quantization for Large Language Models" (Stentzel, 2026).

Key Quantization Details

This model uses a depth of d=4: each weight is encoded as four balanced trits, giving 3^4 = 81 representable levels. Four trits carry log2(81) ≈ 6.34 bits of information, and the shared per-group scale index adds about log2(27)/16 ≈ 0.30 bits per weight, which accounts for the quoted 6.64 bits per weight. The quantization itself is a uniform PTQ method with a group size of 16 and a 27-entry log-spaced scale codebook. The released weights are dequantized to FP16 for compatibility with standard transformers libraries; the true efficiency gain is realized on hardware that can consume the packed trit format directly, exploiting the much lower information content.
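As a rough illustration of the scheme described above, the NumPy sketch below quantizes a weight matrix in groups of 16, choosing each group's scale from a 27-entry log-spaced codebook and rounding to integer levels in [-40, 40], the range expressible by four balanced trits. It is a minimal sketch under those stated parameters, not the tritllm v2 codec; the codebook construction and scale search here are hypothetical stand-ins.

```python
import numpy as np

D = 4                      # trits per weight
GROUP = 16                 # weights sharing one scale
QMAX = (3**D - 1) // 2     # 40: largest integer level four balanced trits can hold

def make_scale_codebook(w: np.ndarray, entries: int = 27) -> np.ndarray:
    """Hypothetical codebook: log-spaced scales spanning observed group magnitudes."""
    gmax = np.abs(w.reshape(-1, GROUP)).max(axis=1)
    lo, hi = np.percentile(gmax[gmax > 0], [1, 100])
    return np.geomspace(lo / QMAX, hi / QMAX, entries)

def quantize_group(g: np.ndarray, codebook: np.ndarray):
    """Pick the codebook scale minimizing round-trip error; return (scale_idx, levels)."""
    best = None
    for idx, s in enumerate(codebook):
        q = np.clip(np.round(g / s), -QMAX, QMAX)
        err = np.square(g - q * s).sum()
        if best is None or err < best[0]:
            best = (err, idx, q.astype(np.int8))
    _, idx, q = best
    return idx, q

def dequantize_group(idx: int, q: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Reconstruct FP16 weights from integer levels and the shared scale."""
    return (q * codebook[idx]).astype(np.float16)

# Round-trip a toy weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 64)).astype(np.float32)
codebook = make_scale_codebook(w)
recon = np.vstack([
    dequantize_group(*quantize_group(g, codebook), codebook)
    for g in w.reshape(-1, GROUP)
]).reshape(w.shape)
print("max abs error:", np.abs(w - recon.astype(np.float32)).max())
```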

Technical Specifications

  • Source Model: mistralai/Mistral-7B-v0.3
  • Quantization Depth: d=4 (81 levels per weight)
  • Bits per Weight: 6.64
  • Quantized Layers: all 2D linear weight matrices
  • Kept in FP16: lm_head, token embeddings, and all *_norm layers
  • Codec: tritllm v2
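
Usage

Because the published tensors are dequantized to FP16, the model should load through the standard transformers path. A minimal sketch, assuming the repository ships ordinary FP16 checkpoints and the stock Mistral tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Entrit/Mistral-7B-v0.3-trit-uniform-d4"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# torch_dtype="auto" picks up the FP16 tensors as stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tokenizer("Balanced ternary quantization", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```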

Use Cases

This model is particularly suitable for applications requiring highly efficient inference where memory footprint and computational cost are critical, especially when deployed on specialized hardware designed to consume packed trit formats. It offers a significantly compressed version of the Mistral-7B-v0.3 model while maintaining its core capabilities.
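The memory arithmetic behind that claim is straightforward: taking the 7B parameter count at face value and ignoring the FP16-kept layers, 6.64 bits per weight comes to about 5.8 GB of packed storage versus roughly 14 GB in FP16. The sketch below shows one generic way to pack balanced trits densely, five per byte since 3^5 = 243 fits in 256; it is illustrative only and not the tritllm v2 wire format.

```python
import numpy as np

def pack_trits(trits: np.ndarray) -> np.ndarray:
    """Pack balanced trits {-1, 0, +1} five-per-byte as base-3 digits (3**5 = 243 <= 256)."""
    t = trits.astype(np.int64) + 1               # shift to unsigned digits {0, 1, 2}
    t = np.concatenate([t, np.zeros((-len(t)) % 5, dtype=np.int64)])  # pad to multiple of 5
    weights = 3 ** np.arange(5, dtype=np.int64)  # little-endian base-3 place values
    return (t.reshape(-1, 5) @ weights).astype(np.uint8)

def unpack_trits(packed: np.ndarray, n: int) -> np.ndarray:
    """Invert pack_trits, recovering the first n balanced trits."""
    b = packed.astype(np.int64)
    digits = np.stack([(b // 3**k) % 3 for k in range(5)], axis=1)
    return (digits.reshape(-1) - 1)[:n].astype(np.int8)

trits = np.random.default_rng(0).integers(-1, 2, size=1000).astype(np.int8)
packed = pack_trits(trits)
assert np.array_equal(unpack_trits(packed, len(trits)), trits)
print(f"{len(trits)} trits -> {packed.nbytes} bytes "
      f"({8 * packed.nbytes / len(trits):.2f} bits/trit vs log2(3) ≈ 1.585)")
```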