mitkox/phi-2-super-OpenHermes-2.5-moe-mlx

Text generation · Model size: 3B · Quant: BF16 · Context length: 2k · Published: Mar 2, 2024 · License: MIT · Architecture: Transformer · Open weights

The mitkox/phi-2-super-OpenHermes-2.5-moe-mlx is a 3 billion parameter Mixture-of-Experts (MoE) language model, created by merging 'abacaj/phi-2-super' with 'g-ronimo/phi-2-OpenHermes-2.5'. This model is specifically designed for efficient inference on Apple Silicon via the MLX framework, offering a compact yet capable solution for general language generation tasks. Its MoE architecture aims to combine the strengths of its constituent models, providing a versatile base for various applications.


Model Overview

mitkox/phi-2-super-OpenHermes-2.5-moe-mlx merges two Phi-2 derivatives, abacaj/phi-2-super and g-ronimo/phi-2-OpenHermes-2.5, into a single 3-billion-parameter Mixture-of-Experts (MoE) model. The merge aims to leverage the combined strengths of both base models, yielding a versatile and efficient text generation capability.

Key Characteristics

  • Mixture-of-Experts (MoE) Architecture: Combines the knowledge and capabilities of two pre-existing Phi-2-based models.
  • Parameter Count: 3 billion parameters, balancing generation quality against computational cost.
  • MLX Framework Compatibility: Optimized for use with Apple Silicon via the MLX framework, enabling efficient local inference.
  • Context Length: Supports a context window of 2048 tokens, suitable for a range of conversational and text generation tasks.
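To make the MoE idea above concrete, here is a toy sketch of expert routing in plain Python. It is illustrative only: the real model's gating network is learned and operates on hidden states, and the two lambda "experts" are hypothetical stand-ins, not the actual merged checkpoints.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_output(x, experts, gate_logits):
    """Combine expert outputs, weighted by a softmax over the gate logits."""
    weights = softmax(gate_logits)
    return sum(w * expert(x) for w, expert in zip(weights, experts))

# Two hypothetical "experts" standing in for the merged Phi-2 variants.
expert_a = lambda x: 2.0 * x   # stand-in for phi-2-super
expert_b = lambda x: x + 1.0   # stand-in for phi-2-OpenHermes-2.5

# Equal logits give equal weights (0.5 each): 0.5*6.0 + 0.5*4.0 = 5.0
print(moe_output(3.0, [expert_a, expert_b], gate_logits=[0.0, 0.0]))
```

In the real model the gate scores every token, so different tokens can be routed to different experts; the sketch collapses this to a single weighted sum for clarity.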

Use Cases

This model is particularly well-suited for developers and researchers working with Apple Silicon hardware who require a capable yet resource-efficient language model. Its MoE design suggests potential for robust performance across various general-purpose language tasks, including:

  • Text generation and completion
  • Conversational AI
  • Prototyping on local machines with MLX support
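For local prototyping, the model can be run with the mlx-lm tooling. A minimal sketch, assuming an Apple Silicon machine with the `mlx-lm` package installed; flag names follow the current `mlx_lm.generate` CLI and may differ across versions:

```shell
# Install the MLX LM tooling (Apple Silicon only)
pip install mlx-lm

# Generate text locally; --model pulls the weights from the Hugging Face Hub
python -m mlx_lm.generate \
  --model mitkox/phi-2-super-OpenHermes-2.5-moe-mlx \
  --prompt "Explain mixture-of-experts models in one paragraph." \
  --max-tokens 256
```

Keep prompts within the 2048-token context window noted above; longer inputs will be truncated or rejected.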