Nexusflow/Athene-V2-Chat

Warm
Public
72.7B
FP8
131072
License: other
Hugging Face
Overview

Athene-V2-Chat-72B Overview

Athene-V2-Chat-72B is a 72.7 billion parameter open-weights large language model developed by Nexusflow, fine-tuned from Qwen 2.5 72B-Instruct. It is designed for chat-based applications and demonstrates strong performance across various benchmarks, rivaling proprietary models like GPT-4o.

Key Capabilities

  • Chat Performance: Achieves performance on par with GPT-4o-0513 in instruction following, longer queries, and multi-turn conversations.
  • Mathematical Reasoning: Excels in hard and mathematical categories, outperforming GPT-4o-0513 on Chatbot Arena.
  • Coding Proficiency: Shows comparable performance to GPT-4o-0513 in coding tasks.
  • Context Length: Features a substantial 131,072 token context window, supporting extensive interactions.
  • RLHF Training: Benefits from Reinforcement Learning from Human Feedback (RLHF) to enhance conversational quality and alignment.

Good For

  • General Chat Applications: Ideal for building highly capable conversational agents.
  • Complex Problem Solving: Suitable for tasks requiring strong mathematical and logical reasoning.
  • Code Generation and Analysis: Effective for programming-related queries and code assistance.
  • Interactive Systems: Its robust instruction following and multi-turn capabilities make it well-suited for dynamic user interactions.