Overview
ailexleon/Cydonia-24B-v3.1-mlx-fp16 is a 24-billion-parameter language model converted for Apple Silicon via the MLX framework. It is a direct conversion of TheDrummer/Cydonia-24B-v3.1, produced with mlx-lm version 0.29.1, and is intended for efficient local inference on compatible hardware.
Key Characteristics
- Parameter Count: 24 billion parameters, offering a balance between performance and resource requirements.
- Context Length: Supports a 32,768-token (32K) context window, allowing longer inputs and more coherent extended outputs.
- MLX Optimization: Converted to the MLX format, ensuring native and optimized performance on Apple Silicon devices.
- Origin: Based on TheDrummer/Cydonia-24B-v3.1, inheriting its core architecture and capabilities.
Usage
This model is intended for developers and researchers who want to run large language models on Apple Silicon. It integrates into Python projects via the mlx-lm library, which handles loading, inference, and chat templates where the tokenizer defines one.
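A minimal loading-and-generation sketch using mlx-lm's `load` and `generate` functions (this requires Apple Silicon, `pip install mlx-lm`, and downloading the fp16 weights; the prompt text is illustrative):

```python
# Minimal mlx-lm inference sketch for this model (Apple Silicon only).
from mlx_lm import load, generate

# Download (if needed) and load the converted weights and tokenizer.
model, tokenizer = load("ailexleon/Cydonia-24B-v3.1-mlx-fp16")

prompt = "Explain the MLX framework in one paragraph."

# If the tokenizer ships a chat template, format the prompt with it.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generate a response; verbose=True streams tokens to stdout.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
```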
Good For
- Local Inference: Ideal for running LLM tasks directly on Apple Silicon hardware without relying on cloud services.
- Development & Prototyping: Suitable for rapid development and testing of AI applications on macOS.
- Applications Requiring Long Context: Suited to use cases that process or generate extensive text, thanks to the 32K context window.
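For the long-context case, a hedged sketch of summarizing a large local file (the filename and prompt wording are hypothetical; `max_tokens` is mlx-lm's cap on generated tokens, and the prompt plus output must together fit in the 32,768-token window):

```python
# Long-context sketch: summarize a large document within the 32K window.
from mlx_lm import load, generate

model, tokenizer = load("ailexleon/Cydonia-24B-v3.1-mlx-fp16")

# Hypothetical input file; in practice, verify its token count stays
# under the 32,768-token context limit before prompting.
with open("report.txt") as f:
    long_document = f.read()

prompt = f"Summarize the following report:\n\n{long_document}"

# Cap the generated length explicitly to leave room for the long prompt.
summary = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(summary)
```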