Name: usermma/nemo-crownelius-st-12b-mlx-fp16 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: usermma

Model Overview

The usermma/nemo-crownelius-st-12b-mlx-fp16 is a 12 billion parameter language model, specifically converted to the MLX format. This conversion was performed from the original ewald1976/nemo-crownelius-st-12b model using mlx-lm version 0.31.2.

Key Characteristics

MLX Format: Optimized for efficient inference on Apple Silicon (Macs with M-series chips).
Parameter Count: Features 12 billion parameters, offering a balance between performance and computational requirements for local deployment.
FP16 Precision: Utilizes FP16 (half-precision floating-point) for reduced memory footprint and potentially faster inference on compatible hardware.

Usage

This model is intended for developers and researchers who wish to run a 12B parameter language model locally on Apple Silicon devices. It integrates seamlessly with the mlx-lm library, providing a straightforward method for loading and generating text. The provided code examples demonstrate how to load the model and tokenizer, apply chat templates if available, and generate responses.

Good For

Local Inference: Ideal for running a capable language model directly on Apple Silicon hardware without cloud dependencies.
MLX Ecosystem Development: Suitable for projects and applications built within the MLX framework.
Experimentation: Provides a substantial model for local experimentation and development of LLM-powered features.

Overview

Model Overview

Key Characteristics

Usage

Good For

Full Model Card (README)