RWKV/v5-Eagle-7B-pth is a 7.52-billion-parameter model built on the RWKV-v5 architecture, a linear transformer designed for significantly lower inference cost. Trained on 1.1 trillion tokens across more than 100 languages, it excels on multilingual benchmarks, outperforming other 7B-class models. This foundation model approaches the performance of larger transformer models such as Falcon and LLaMA2 on English evaluations while being an "Attention-Free Transformer."
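For orientation, here is a minimal sketch of loading a .pth checkpoint like this one with the `rwkv` pip package (the ChatRWKV runtime) and generating text; the checkpoint path, strategy string, and sampling settings are placeholder assumptions, not values taken from this page:

```python
import os

# Optional runtime flags read by the rwkv package; set before importing rwkv.model.
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # "1" compiles the custom CUDA kernel if available

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Path to the downloaded Eagle 7B checkpoint, given without the .pth extension
# as in the package's own examples (placeholder path).
model = RWKV(model="path/to/v5-Eagle-7B", strategy="cuda fp16")

# Eagle 7B is a "World" model, so it uses the RWKV World tokenizer bundled with the package.
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

# Example sampling settings (assumed, not prescribed by the model card).
args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
print(pipeline.generate("The Eiffel Tower is located in", token_count=64, args=args))
```

On CPU-only machines, a strategy such as "cpu fp32" (or a quantized variant like "cuda fp16i8" on smaller GPUs) can be substituted in the `strategy` argument.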