recursal/Finch-MoE-37B-A11B-v0.1-HF

Public · 37B · FP8 · 16384
Nov 5, 2024 · License: apache-2.0

Overview

Finch-MoE-37B-A11B-v0.1-HF: A Mixture of Experts RWKV Model

recursal/Finch-MoE-37B-A11B-v0.1-HF is a Hugging Face compatible implementation of the Flock of Finches Mixture of Experts (MoE) model, developed by Recursal with compute sponsored by TensorWave. The model has 37 billion total parameters, of which 11 billion are active per token, aiming for a balance of capability and inference efficiency.
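The active-parameter saving comes from sparse routing: a learned router picks a small top-k subset of expert sub-networks for each token, so only those experts' weights are applied. The following is a minimal NumPy sketch of top-k MoE routing for illustration only; the expert count, dimensions, and ReLU feed-forward experts are made-up toy values, not this model's actual architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k = 8, 2          # hypothetical sizes, for illustration only
d_model, d_ff = 16, 64

# Router: one linear layer scoring every expert for a given token.
W_router = rng.normal(size=(d_model, n_experts))
# Expert feed-forward weights; only the chosen top-k are applied per token.
W_in = rng.normal(size=(n_experts, d_model, d_ff)) * 0.1
W_out = rng.normal(size=(n_experts, d_ff, d_model)) * 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(x):
    """Route one token vector through its top-k experts."""
    scores = x @ W_router                 # (n_experts,) router logits
    top = np.argsort(scores)[-top_k:]     # indices of the k highest-scoring experts
    gates = softmax(scores[top])          # renormalized gate weights over the top-k
    y = np.zeros_like(x)
    for g, e in zip(gates, top):
        h = np.maximum(x @ W_in[e], 0.0)  # expert FFN (ReLU, toy example)
        y += g * (h @ W_out[e])           # gate-weighted sum of expert outputs
    return y, top

token = rng.normal(size=d_model)
y, used = moe_layer(token)
print(f"experts used: {sorted(used.tolist())} of {n_experts}")
```

Because only `top_k` of the `n_experts` expert weight matrices touch each token, per-token compute scales with the active parameters rather than the total, which is the same principle behind the 11B-active / 37B-total split.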

Key Capabilities

  • Mixture of Experts Architecture: Routes each token through a subset of expert sub-networks, so only 11B of the 37B total parameters are active per token, giving inference costs closer to a dense model of the active size while retaining the capacity of the full parameter count.
  • Improved General Language Understanding: Demonstrates notable gains across various benchmarks compared to its predecessors, Eagle 7B, Finch 7B, and Finch 14B.
  • Hugging Face Compatibility: Fully compatible with the Hugging Face transformers library for straightforward integration and deployment.
  • Multilingual Support: Example usage shows generation in Chinese, indicating potential for multilingual applications.
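Given the stated transformers compatibility, loading and generating typically looks like the sketch below. The `trust_remote_code`, dtype, and generation settings here are assumptions for custom RWKV/MoE architectures, not confirmed details of this repository; prefer the usage snippet on the model page itself if one is provided.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "recursal/Finch-MoE-37B-A11B-v0.1-HF"

# trust_remote_code is assumed here because custom architectures
# often ship their modeling code in the repository.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",   # pick the checkpoint's native dtype
    device_map="auto",    # spread layers across available devices
)

prompt = "The RWKV architecture differs from a Transformer in that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that a 37B-parameter checkpoint requires substantial GPU memory even with `device_map="auto"`; quantized loading may be needed on smaller hardware.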

Good for

  • General-purpose text generation: Capable of generating detailed responses to various prompts, as shown in the examples.
  • Research and experimentation with MoE models: Provides a readily available MoE model within the RWKV family for developers to explore.
  • Applications requiring enhanced reasoning: Shows improvements on benchmarks like ARC-C and Winogrande, suggesting stronger reasoning capabilities.

Performance Highlights

Evaluations show the Flock of Finches 37B-A11B v0.1 model outperforming earlier RWKV models:

  • ARC-C: 48.04% (vs. 39.59% for Eagle 7B)
  • MMLU: 55.58% (vs. 30.86% for Eagle 7B)
  • Winogrande: 75.14% (vs. 67.56% for Eagle 7B)
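Benchmark numbers of this kind are commonly reproduced with EleutherAI's lm-evaluation-harness. The command below is a hedged sketch; the exact task names, harness version, and settings the authors used are not stated here, so scores may not match precisely.

```shell
pip install lm-eval

# Task names assume the standard harness identifiers for these benchmarks.
lm_eval --model hf \
  --model_args pretrained=recursal/Finch-MoE-37B-A11B-v0.1-HF,trust_remote_code=True \
  --tasks arc_challenge,mmlu,winogrande \
  --batch_size 1
```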

This model is a significant step in the RWKV series, offering a powerful MoE option for a range of language tasks.