plawanrath/mistral-7b-instruct-v0.3-bf16-mlx-cba
The plawanrath/mistral-7b-instruct-v0.3-bf16-mlx-cba model is a 7.2-billion-parameter instruction-tuned language model in the Mistral family, developed by Plawan Kumar Rath and Rahul Maliakkal. This variant is an MLX-format BF16 (uncompressed baseline) version of Mistral-7B-Instruct-v0.3, serving primarily as a reference artifact for research on quantization bias. It is intended for MLX environments, particularly for studying how quantization affects model behavior and bias emergence.
Overview
This model, plawanrath/mistral-7b-instruct-v0.3-bf16-mlx-cba, is an MLX-format BF16 (uncompressed baseline) variant of the mistralai/Mistral-7B-Instruct-v0.3 instruction-tuned model. It features 7.2 billion parameters and is specifically re-serialized for direct loading and use within the MLX framework without additional conversion steps. This artifact was developed by Plawan Kumar Rath and Rahul Maliakkal and is one of 15 models used in their research paper, "Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels."
Key Characteristics
- Base Model: mistralai/Mistral-7B-Instruct-v0.3
- Parameters: 7.2 billion
- Precision: BF16 (uncompressed baseline)
- Format: MLX, targeting Apple Silicon environments.
- Research Context: Serves as the uncompressed reference for studying how quantization aggressiveness drives emergent stereotypical behavior, particularly on fairness-sensitive tasks such as the BBQ ambiguous-questions dataset (a sketch of producing quantized counterparts for such comparisons follows this list).
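For comparisons like those in the paper, quantized counterparts can be generated from this baseline with mlx-lm's convert utility. The sketch below is illustrative rather than the authors' released pipeline: the output directory name is hypothetical, and the supported bit widths depend on your mlx-lm version.

```python
# Illustrative sketch: quantize the BF16 baseline for side-by-side bias
# comparisons. Paths and bit width are examples, not the paper's pipeline.
from mlx_lm import convert

convert(
    "plawanrath/mistral-7b-instruct-v0.3-bf16-mlx-cba",  # this BF16 baseline
    mlx_path="mistral-7b-instruct-v0.3-q4-mlx",          # hypothetical output dir
    quantize=True,
    q_bits=4,         # e.g., 4-bit; the paper also examines Q3 and Q8
    q_group_size=64,  # MLX's default quantization group size
)
```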
Research Findings
The associated paper reports a "dose-response" relationship between quantization aggressiveness and bias emergence: under Q3 quantization, 6.0–21.1% of items that were unbiased at BF16 became biased, versus only 0.1–0.9% under Q8. These shifts are largely invisible to perplexity (e.g., <0.5% shift at Q8, <3% at Q4), indicating that perplexity alone may not capture the fairness impact of quantization. Compressed instruction-tuned models should therefore be evaluated carefully before use in fairness-sensitive applications.
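To make the reported percentages concrete, here is a minimal sketch of a flip-rate metric in the spirit of the paper's analysis; the item IDs, label format, and function name are assumptions, not the authors' evaluation code.

```python
# Minimal sketch (assumed, not the paper's code): the fraction of items
# judged unbiased at BF16 that become biased after quantization.

def bias_flip_rate(bf16_biased: dict[str, bool], quant_biased: dict[str, bool]) -> float:
    unbiased_at_bf16 = [item for item, biased in bf16_biased.items() if not biased]
    if not unbiased_at_bf16:
        return 0.0
    flipped = sum(quant_biased[item] for item in unbiased_at_bf16)
    return flipped / len(unbiased_at_bf16)

# Toy example: 3 of 4 items are unbiased at BF16; one flips under Q3.
bf16 = {"q1": False, "q2": False, "q3": True, "q4": False}
q3   = {"q1": False, "q2": True,  "q3": True, "q4": False}
print(f"{bias_flip_rate(bf16, q3):.1%}")  # 33.3%
```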
Usage
This model can be loaded and used directly with the mlx-lm library for inference, making it straightforward to run Mistral-7B-Instruct-v0.3 at BF16 precision on Apple Silicon hardware.
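A minimal inference sketch follows, assuming mlx-lm is installed (pip install mlx-lm); it uses mlx-lm's standard load/generate API, and the prompt is a placeholder.

```python
# Minimal inference sketch with mlx-lm; the prompt is a placeholder.
from mlx_lm import load, generate

model, tokenizer = load("plawanrath/mistral-7b-instruct-v0.3-bf16-mlx-cba")

prompt = "Explain BF16 precision in one sentence."

# Apply the model's chat template if one is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```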