usermma/nemo-crownelius-st-12b-mlx-fp16

TEXT GENERATIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 14, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The usermma/nemo-crownelius-st-12b-mlx-fp16 is a 12 billion parameter language model, converted to the MLX format from the ewald1976/nemo-crownelius-st-12b model. This model is specifically designed for efficient deployment and inference on Apple Silicon, leveraging the MLX framework. It provides a readily usable version for developers working within the MLX ecosystem, enabling local execution of a substantial language model.

Loading preview...

Model Overview

The usermma/nemo-crownelius-st-12b-mlx-fp16 is a 12 billion parameter language model, specifically converted to the MLX format. This conversion was performed from the original ewald1976/nemo-crownelius-st-12b model using mlx-lm version 0.31.2.

Key Characteristics

  • MLX Format: Optimized for efficient inference on Apple Silicon (Macs with M-series chips).
  • Parameter Count: Features 12 billion parameters, offering a balance between performance and computational requirements for local deployment.
  • FP16 Precision: Utilizes FP16 (half-precision floating-point) for reduced memory footprint and potentially faster inference on compatible hardware.

Usage

This model is intended for developers and researchers who wish to run a 12B parameter language model locally on Apple Silicon devices. It integrates seamlessly with the mlx-lm library, providing a straightforward method for loading and generating text. The provided code examples demonstrate how to load the model and tokenizer, apply chat templates if available, and generate responses.

Good For

  • Local Inference: Ideal for running a capable language model directly on Apple Silicon hardware without cloud dependencies.
  • MLX Ecosystem Development: Suitable for projects and applications built within the MLX framework.
  • Experimentation: Provides a substantial model for local experimentation and development of LLM-powered features.