trast-ai/Nemotron-Orchestrator-8B-MLX

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Nov 29, 2025Architecture:Transformer0.0K Warm

The trast-ai/Nemotron-Orchestrator-8B-MLX is an 8 billion parameter language model, converted to the MLX format from NVIDIA's Orchestrator-8B. This model is designed for efficient deployment and inference on Apple Silicon, leveraging the MLX framework. It maintains the core capabilities of the original Orchestrator-8B, making it suitable for general-purpose language generation and understanding tasks within the MLX ecosystem.

Loading preview...

trast-ai/Nemotron-Orchestrator-8B-MLX Overview

This model is a specialized version of NVIDIA's Orchestrator-8B, specifically converted to the MLX format by trast-ai. With 8 billion parameters and a context length of 32768 tokens, it is optimized for efficient execution on Apple Silicon devices using the mlx-lm library.

Key Characteristics

  • MLX Conversion: Directly usable with the mlx-lm framework, ensuring compatibility and performance on Apple hardware.
  • Base Model: Derived from nvidia/Orchestrator-8B, inheriting its general language understanding and generation capabilities.
  • Ease of Use: Provides straightforward integration for developers working within the MLX ecosystem, with clear instructions for loading and generating text.

Use Cases

This model is particularly well-suited for:

  • Local Inference: Running large language model tasks directly on Apple Silicon devices.
  • MLX-based Applications: Developing and deploying applications that leverage the MLX framework for language processing.
  • General Language Tasks: Performing tasks such as text generation, summarization, and question answering, benefiting from the efficiency of the MLX conversion.