Entrit/Mistral-7B-v0.3-trit-uniform-d4

Text Generation · Model Size: 7B · Context Length: 4k · Published: May 4, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Entrit/Mistral-7B-v0.3-trit-uniform-d4 is a 7 billion parameter language model based on Mistral-7B-v0.3, developed by Entrit Systems. It features balanced ternary post-training quantization at a depth of d=4, resulting in 6.64 bits per weight. This model is optimized for efficient inference on hardware capable of consuming packed trit formats, offering a highly compressed representation of the original Mistral-7B-v0.3.


Overview

Entrit/Mistral-7B-v0.3-trit-uniform-d4 is a 7 billion parameter language model derived from mistralai/Mistral-7B-v0.3. Developed by Entrit Systems, this model utilizes a balanced ternary post-training quantization (PTQ) method, as detailed in the paper "Balanced Ternary Post-Training Quantization for Large Language Models" (Stentzel, 2026).

Key Quantization Details

This model uses a depth of d=4: each weight is encoded as four balanced trits, giving 3^4 = 81 representable levels. Four trits carry log2(81) ≈ 6.34 bits of information, and the shared per-group scale index adds about log2(27)/16 ≈ 0.30 bits per weight, which accounts for the quoted 6.64 bits per weight. The quantization itself is a uniform PTQ method with a group size of 16 and a 27-entry log-spaced scale codebook. The released weights are dequantized to FP16 for compatibility with standard transformers libraries; the true efficiency gain is realized on hardware that can consume the packed trit format directly, exploiting the much lower information content.
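As a rough illustration of the scheme described above, the NumPy sketch below quantizes a weight matrix in groups of 16, choosing each group's scale from a 27-entry log-spaced codebook and rounding to integer levels in [-40, 40], the range expressible by four balanced trits. It is a minimal sketch under those stated parameters, not the tritllm v2 codec; the codebook construction and scale search here are hypothetical stand-ins.

```python
import numpy as np

D = 4                      # trits per weight
GROUP = 16                 # weights sharing one scale
QMAX = (3**D - 1) // 2     # 40: largest integer level four balanced trits can hold

def make_scale_codebook(w: np.ndarray, entries: int = 27) -> np.ndarray:
    """Hypothetical codebook: log-spaced scales spanning observed group magnitudes."""
    gmax = np.abs(w.reshape(-1, GROUP)).max(axis=1)
    lo, hi = np.percentile(gmax[gmax > 0], [1, 100])
    return np.geomspace(lo / QMAX, hi / QMAX, entries)

def quantize_group(g: np.ndarray, codebook: np.ndarray):
    """Pick the codebook scale minimizing round-trip error; return (scale_idx, levels)."""
    best = None
    for idx, s in enumerate(codebook):
        q = np.clip(np.round(g / s), -QMAX, QMAX)
        err = np.square(g - q * s).sum()
        if best is None or err < best[0]:
            best = (err, idx, q.astype(np.int8))
    _, idx, q = best
    return idx, q

def dequantize_group(idx: int, q: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Reconstruct FP16 weights from integer levels and the shared scale."""
    return (q * codebook[idx]).astype(np.float16)

# Round-trip a toy weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 64)).astype(np.float32)
codebook = make_scale_codebook(w)
recon = np.vstack([
    dequantize_group(*quantize_group(g, codebook), codebook)
    for g in w.reshape(-1, GROUP)
]).reshape(w.shape)
print("max abs error:", np.abs(w - recon.astype(np.float32)).max())
```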

Technical Specifications

  • Source Model: mistralai/Mistral-7B-v0.3
  • Quantization Depth: d=4 (81 levels per weight)
  • Bits per Weight: 6.64
  • Quantized Layers: all 2D linear weight matrices
  • Kept in FP16: lm_head, token embeddings, and all *_norm layers
  • Codec: tritllm v2
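
Usage

Because the published tensors are dequantized to FP16, the model should load through the standard transformers path. A minimal sketch, assuming the repository ships ordinary FP16 checkpoints and the stock Mistral tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Entrit/Mistral-7B-v0.3-trit-uniform-d4"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# torch_dtype="auto" picks up the FP16 tensors as stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tokenizer("Balanced ternary quantization", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```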

Use Cases

This model is particularly suitable for applications requiring highly efficient inference where memory footprint and computational cost are critical, especially when deployed on specialized hardware designed to consume packed trit formats. It offers a significantly compressed version of the Mistral-7B-v0.3 model while maintaining its core capabilities.
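The memory arithmetic behind that claim is straightforward: taking the 7B parameter count at face value and ignoring the FP16-kept layers, 6.64 bits per weight comes to about 5.8 GB of packed storage versus roughly 14 GB in FP16. The sketch below shows one generic way to pack balanced trits densely, five per byte since 3^5 = 243 fits in 256; it is illustrative only and not the tritllm v2 wire format.

```python
import numpy as np

def pack_trits(trits: np.ndarray) -> np.ndarray:
    """Pack balanced trits {-1, 0, +1} five-per-byte as base-3 digits (3**5 = 243 <= 256)."""
    t = trits.astype(np.int64) + 1               # shift to unsigned digits {0, 1, 2}
    t = np.concatenate([t, np.zeros((-len(t)) % 5, dtype=np.int64)])  # pad to multiple of 5
    weights = 3 ** np.arange(5, dtype=np.int64)  # little-endian base-3 place values
    return (t.reshape(-1, 5) @ weights).astype(np.uint8)

def unpack_trits(packed: np.ndarray, n: int) -> np.ndarray:
    """Invert pack_trits, recovering the first n balanced trits."""
    b = packed.astype(np.int64)
    digits = np.stack([(b // 3**k) % 3 for k in range(5)], axis=1)
    return (digits.reshape(-1) - 1)[:n].astype(np.int8)

trits = np.random.default_rng(0).integers(-1, 2, size=1000).astype(np.int8)
packed = pack_trits(trits)
assert np.array_equal(unpack_trits(packed, len(trits)), trits)
print(f"{len(trits)} trits -> {packed.nbytes} bytes "
      f"({8 * packed.nbytes / len(trits):.2f} bits/trit vs log2(3) ≈ 1.585)")
```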