gbueno86/Meta-LLama-3-Cat-Smaug-LLama-70b
gbueno86/Meta-LLama-3-Cat-Smaug-LLama-70b is a 70-billion-parameter language model, a merge of Meta-LLama-3-Cat-A-LLama-70b and abacusai_Smaug-Llama-3-70B-Instruct using the SLERP method. It targets general language understanding and generation, with demonstrated capabilities in logical reasoning, problem-solving, and creative text generation across several benchmarks.
Model Overview
This model, gbueno86/Meta-LLama-3-Cat-Smaug-LLama-70b, was created by merging two pre-trained 70B models, Meta-LLama-3-Cat-A-LLama-70b and abacusai_Smaug-Llama-3-70B-Instruct, using SLERP (Spherical Linear Interpolation), with the goal of combining their strengths. It supports a context length of 8192 tokens.
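To illustrate the SLERP operation used in the merge: rather than averaging two weight tensors linearly, SLERP interpolates along the arc between them, preserving their geometric relationship. The sketch below shows the idea on plain Python lists; the actual merge tooling (e.g. mergekit) applies this per-tensor with its own normalization details, so this is an illustration of the math, not the exact merge pipeline.

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow
    the great-circle arc between the (direction of the) two vectors.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # Angle between the two vectors, clamped for numerical safety
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Interpolating halfway between two orthogonal unit vectors
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

Unlike a linear average (which would give `[0.5, 0.5]`, shrinking the norm), the SLERP midpoint of two orthogonal unit vectors stays on the unit circle, which is why SLERP is a popular choice for merging model weights.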
Key Capabilities
- Logical Reasoning: Demonstrates step-by-step reasoning for complex problems, such as the "ball in the microwave" and "killers in a room" scenarios.
- Problem Solving: Capable of breaking down multi-step problems, like calculating ways to open doors and windows for airflow.
- Code Generation: Can generate functional code, as shown by the Pygame "Snake" game example.
- Creative Text Generation: Able to produce creative content, including poems and horror stories, while following specific thematic instructions.
- Instruction Following: Accurately processes and responds to diverse user prompts, including JSON generation requests.
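Since both parent models are Llama-3-family instruct models, prompts for the capabilities above would typically use the Llama 3 instruct chat template. A minimal sketch of that prompt layout, built by hand for clarity (in practice you would let the tokenizer's `apply_chat_template` do this):

```python
def build_llama3_prompt(system, user):
    """Assemble a Llama 3 instruct-style prompt string.

    Uses the Llama 3 special tokens: <|begin_of_text|>, header
    markers for each role, and <|eot_id|> to end each turn. The
    trailing assistant header cues the model to generate a reply.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant. Answer in JSON.",
    "List two primary colors.",
)
print(prompt)
```

This string would then be tokenized and passed to the model; for JSON-generation requests, the instruction simply goes in the system or user turn as shown.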
Performance Highlights
Evaluations on the Open LLM Leaderboard report an average score of 38.27. Notable individual scores include:
- IFEval (0-shot): 80.72
- BBH (3-shot): 51.51
- MMLU-PRO (5-shot): 45.28
Good For
This model is suitable for applications requiring robust general-purpose language understanding, logical deduction, and creative content generation. Its merged architecture aims to leverage the strengths of its constituent models, making it a versatile choice for tasks ranging from complex reasoning to interactive conversational agents.