meshllm/gemma-3-1b-it-parity-bf16-mlx

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 1B
  • Quantization: BF16
  • Context length: 32k
  • Published: Apr 6, 2026
  • License: gemma
  • Architecture: Transformer

The meshllm/gemma-3-1b-it-parity-bf16-mlx model is a 1-billion-parameter instruction-tuned Gemma variant, converted by meshllm to the MLX bf16 format. It is designed specifically for backend parity testing within the mesh-llm framework, ensuring consistent behavior with its GGUF counterpart, and is intended for direct comparison and validation of MLX versus GGUF inference rather than general community use. It retains the 32,768-token context length of the original Google Gemma model.


Overview

This model, meshllm/gemma-3-1b-it-parity-bf16-mlx, is a same-origin MLX bf16 conversion of google/gemma-3-1b-it, produced by meshllm for backend parity testing within the mesh-llm framework.

Key Characteristics

  • Parity Testing Focus: Specifically created to be paired with meshllm/gemma-3-1b-it-parity-f16-gguf for validating MLX versus GGUF behavior.
  • Source: Converted directly from the original google/gemma-3-1b-it Hugging Face checkpoint.
  • Validation: Rigorously validated against its GGUF counterpart using the mesh-llm exact prompt suite, demonstrating exact output matches across a range of prompts (e.g., primary, alt-green, capital-france, two-plus-two).
  • Format: Provided in MLX bf16 format, including model.safetensors and associated tokenizer/config files.
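The exact-match validation described above can be sketched as a small harness. A minimal sketch, assuming the following: the prompt names come from this card, but their texts are illustrative placeholders, and `generate_mlx` / `generate_gguf` are hypothetical stand-ins for the two backends' deterministic (greedy) decode functions, not part of any published mesh-llm API.

```python
# Hypothetical parity harness: compare deterministic outputs of two backends
# on a fixed prompt suite and report any prompts whose outputs diverge.
# Prompt names match the card's exact prompt suite; the texts are placeholders.

PROMPTS = {
    "primary": "Describe the sky in one sentence.",        # placeholder text
    "alt-green": "Describe grass in one sentence.",        # placeholder text
    "capital-france": "What is the capital of France?",    # placeholder text
    "two-plus-two": "What is 2 + 2?",                      # placeholder text
}


def check_parity(generate_mlx, generate_gguf, prompts=PROMPTS):
    """Return the names of prompts whose two backend outputs differ.

    Both arguments are callables mapping a prompt string to a completion
    string; with greedy decoding the comparison is an exact string match.
    """
    mismatches = []
    for name, prompt in prompts.items():
        if generate_mlx(prompt) != generate_gguf(prompt):
            mismatches.append(name)
    return mismatches
```

With real MLX and GGUF backends plugged in, an empty return list corresponds to the exact-output-match result reported above; any returned names pinpoint which prompts diverge.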

Intended Use

This model is not positioned as a general-purpose community artifact. Instead, it is a specialized tool for developers working with mesh-llm who require a clean, apples-to-apples comparison for reproducible parity testing between different inference backends (MLX and GGUF).