MaziyarPanahi/calme-2.4-llama3-70b

TEXT GENERATION

  • Concurrency Cost: 4
  • Model Size: 70B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Apr 28, 2024
  • License: llama3
  • Architecture: Transformer

MaziyarPanahi/calme-2.4-llama3-70b is a 70-billion-parameter language model fine-tuned by MaziyarPanahi with Direct Preference Optimization (DPO) on Meta's Llama-3-70B-Instruct base. The model is optimized for instruction following and general conversational tasks, and scores strongly on benchmarks including MMLU, HellaSwag, and GSM8k. It targets applications that require robust reasoning and accurate responses, building on the capabilities of the Llama-3 architecture.


Model Overview

MaziyarPanahi/calme-2.4-llama3-70b is a 70-billion-parameter language model developed by MaziyarPanahi. It is a DPO fine-tune of the meta-llama/Meta-Llama-3-70B-Instruct base model, building on the Llama-3 architecture for improved instruction following and conversational fluency.

Key Capabilities & Performance

This model demonstrates strong performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Notable scores include:

  • MMLU (5-Shot): 80.50
  • HellaSwag (10-Shot): 86.03
  • GSM8k (5-Shot): 87.34
  • AI2 Reasoning Challenge (25-Shot): 72.61

These metrics indicate its proficiency in general reasoning, common sense, and mathematical problem-solving. The model utilizes the ChatML prompt template, making it compatible with standard instruction-tuned workflows.
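Since the model expects the ChatML prompt template, a small helper can make the expected format concrete. The sketch below is a hypothetical illustration of the standard ChatML convention (`<|im_start|>` / `<|im_end|>` delimiters with `system`, `user`, and `assistant` roles); the exact special tokens should be confirmed against the model's tokenizer configuration.

```python
# Hypothetical helper illustrating the ChatML prompt format this model expects.
# The <|im_start|>/<|im_end|> delimiters follow the standard ChatML convention;
# verify stop tokens against the model's tokenizer_config.json before relying on them.

def format_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model continues from here.
    prompt += "<|im_start|>assistant\n"
    return prompt

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
```

In practice, `tokenizer.apply_chat_template` in the transformers library handles this formatting automatically when the tokenizer ships a chat template; the helper above is only meant to show what the rendered prompt looks like.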

Usage and Availability

Users can integrate this model using the Hugging Face transformers library. Quantized GGUF versions are also available for more efficient deployment on various hardware configurations. The model is suitable for a wide array of text generation tasks, particularly those requiring precise instruction adherence and high-quality output.
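A minimal loading sketch with the transformers library might look like the following. Note that a 70B checkpoint needs substantial GPU memory (roughly 140 GB at 16-bit precision), so `device_map="auto"` and `torch_dtype` are shown here as typical choices under the assumption of multi-GPU or offloaded deployment, not as requirements; for constrained hardware, the GGUF quantizations mentioned above are the lighter path.

```python
# A minimal sketch, assuming a machine with enough GPU memory (or CPU offload)
# to hold the 70B weights; parameters shown are typical choices, not mandated ones.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "MaziyarPanahi/calme-2.4-llama3-70b"

def main():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # half-precision to reduce memory footprint
        device_map="auto",           # shard/offload across available devices
    )
    messages = [{"role": "user", "content": "Explain DPO in one sentence."}]
    # apply_chat_template renders the tokenizer's own chat format (ChatML here).
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```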