RWKV/v5-EagleX-v2-7B-HF

Parameters: 7B · Precision: FP8 · Context length: 16384
Released: Apr 17, 2024
License: apache-2.0
Overview

RWKV EagleX 7B v2: Hugging Face Transformers Implementation

RWKV/v5-EagleX-v2-7B-HF is the Hugging Face Transformers implementation of the RWKV EagleX 7B v2 model, a 7.52 billion parameter causal language model. It was trained on 2.25 trillion tokens, a substantial increase over its predecessors, Eagle-7B-HF (1.1T tokens) and EagleX-7B-HF-v1 (1.7T tokens).

Key Capabilities & Performance

This model demonstrates improved performance across a range of benchmarks compared to earlier EagleX iterations, with its average benchmark accuracy (avg_acc) rising from 0.4822 to 0.5495. Notably, it shows strong results in:

  • General Language Understanding: Achieving 0.7439 on GLUE and 0.7884 on MNLI, outperforming several other 7B class models like OLMo-7B and Llama-2-7b-hf in these specific metrics.
  • Reasoning: Improved scores on MMLU (0.438) and Winogrande (0.7332).
  • Text Generation: Competitive performance on the LAMBADA benchmarks.

Differentiators & Use Cases

This is a base model rather than an instruction-tuned one, so it is best suited to tasks requiring foundational language understanding and generation. Its competitive benchmark scores against other 7B models, particularly in general language understanding, make it a robust base for further fine-tuning or for applications that need a strong general-purpose language model. The model integrates directly with the Hugging Face Transformers library, supporting both CPU and GPU inference as well as batch processing (see the sketch below).
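
To make the Transformers integration concrete, here is a minimal sketch of loading the model and running batched generation on GPU or CPU. The trust_remote_code flag reflects the checkpoint's custom RWKV5 code path; the dtype, pad token, prompts, and sampling settings are illustrative assumptions rather than values prescribed by this page.

```python
# Minimal usage sketch for RWKV/v5-EagleX-v2-7B-HF with Hugging Face Transformers.
# Assumes the checkpoint's custom RWKV5 code path (trust_remote_code=True).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/v5-EagleX-v2-7B-HF"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
    padding_side="left",   # left-pad so batched generation aligns on the right
    pad_token="<s>",       # pad-token choice is an assumption for batching
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

# Batch processing: tokenize several prompts at once with padding enabled.
prompts = [
    "The RWKV architecture combines the strengths of RNNs and Transformers by",
    "In one sentence, a language model is",
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

for text in tokenizer.batch_decode(output_ids, skip_special_tokens=True):
    print(text)
    print("---")
```

For single-prompt or CPU-only use, the padding arguments can be dropped and torch.float32 kept as the dtype.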

For more detailed evaluation data, refer to the full eval data spreadsheet and the blog article detailing its launch and performance.