Dampfinchen/Llama-3-8B-Ultra-Instruct

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8K · Published: Apr 29, 2024 · License: llama3 · Architecture: Transformer

Dampfinchen/Llama-3-8B-Ultra-Instruct is an 8 billion parameter merged language model based on the Llama 3 architecture, created by Dampf. This model integrates multiple specialized models to enhance general intelligence, add German-language support, improve RAG capabilities, and broaden medical knowledge, while also incorporating uncensored roleplaying functionality. It is designed as a small, general-purpose model with an 8192-token context length, aiming to preserve Llama Instruct's core intelligence while adding diverse capabilities.


Llama-3-8B-Ultra-Instruct: A Merged General-Purpose Model

Dampfinchen/Llama-3-8B-Ultra-Instruct is an 8 billion parameter language model developed by Dampf, created using the DARE TIES merge method. It is built upon the Undi95/Meta-Llama-3-8B-Instruct-hf base model and integrates several other specialized models to achieve a broad range of capabilities. The merge strategy uses conservative weight values to maintain the base Llama Instruct's intelligence while introducing new features.
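As an illustration only, a DARE TIES merge of this kind is typically expressed as a mergekit configuration along the following lines. The structure (merge method, base model, per-model weight/density parameters) follows standard mergekit conventions, but the weight and density values below are hypothetical placeholders, not the actual recipe used for this model.

```yaml
# Illustrative mergekit config for a DARE TIES merge.
# Weights and densities are hypothetical placeholders --
# NOT the actual values used for Llama-3-8B-Ultra-Instruct.
merge_method: dare_ties
base_model: Undi95/Meta-Llama-3-8B-Instruct-hf
models:
  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
  - model: jondurbin/bagel-8b-v1.0
    parameters: {weight: 0.1, density: 0.5}
  - model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
    parameters: {weight: 0.1, density: 0.5}
  - model: aaditya/OpenBioLLM-Llama3-8B
    parameters: {weight: 0.1, density: 0.5}
  - model: Undi95/Llama-3-LewdPlay-8B-evo
    parameters: {weight: 0.1, density: 0.5}
dtype: bfloat16
```

Conservative (small) weight values, as the model card notes, keep the merged weights close to the base Instruct model so its core capabilities are preserved.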

Key Capabilities & Features

  • Enhanced General Intelligence: Combines multiple instruct models to boost overall reasoning and understanding.
  • Improved RAG Capabilities: Integrates jondurbin/bagel-8b-v1.0 to enhance Retrieval Augmented Generation (RAG).
  • Multilingual Support: Includes VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct for German language capabilities.
  • Specialized Knowledge: Incorporates aaditya/OpenBioLLM-Llama3-8B to add knowledge in the medical and biological fields.
  • Roleplaying & Uncensored Content: Features models like Undi95/Llama-3-LewdPlay-8B-evo for high-quality, uncensored roleplaying, though users should be aware of potentially harmful responses.
  • Vision Support: The merge includes components that introduce vision capabilities.

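The RAG capability above can be exercised by injecting retrieved passages into the system turn of Llama 3's chat template. The snippet below is a minimal sketch of that prompt construction; the passages, question, and helper function are illustrative, and retrieval itself (e.g. a vector-store lookup) is out of scope.

```python
# Sketch: building a RAG-style prompt in the Llama 3 Instruct chat
# format. Retrieved passages are numbered and placed in the system
# turn so the model can cite them. Inputs here are illustrative.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Embed retrieved context into a Llama 3 Instruct prompt string."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Answer the user's question using only the context below. "
        "Cite passage numbers.\n\nContext:\n" + context
    )
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{question}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_rag_prompt(
    "What is the model's context length?",
    ["Llama-3-8B-Ultra-Instruct supports an 8192-token context."],
)
```

The trailing assistant header leaves the prompt open for the model to generate its answer; in practice, a tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template`) produces the same layout.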
Performance

On the Open LLM Leaderboard, the model achieves an average score of 69.11. Notable scores include 81.63 on HellaSwag (10-Shot), 68.32 on MMLU (5-Shot), and 70.36 on GSM8k (5-Shot).

Good For

  • Applications requiring a versatile 8B model with enhanced general intelligence.
  • Use cases benefiting from improved RAG and German language support.
  • Medical or biological text generation and understanding.
  • Creative writing and roleplaying scenarios, including those requiring uncensored responses (with caution).

This model aims to provide a compact yet powerful solution by selectively integrating diverse functionalities into the Llama 3 8B Instruct framework.

Popular Sampler Settings

The three parameter combinations most commonly used by Featherless users for this model cover the following samplers: `temperature`, `top_p`, `top_k`, `frequency_penalty`, `presence_penalty`, `repetition_penalty`, and `min_p`.
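As a sketch of how these sampler parameters are typically supplied, the snippet below assembles a request payload for an OpenAI-compatible chat completions endpoint. All values are illustrative placeholders, not the actual Featherless user configurations.

```python
# Sketch: assembling a chat completion request payload with the
# sampler parameters listed above. Values are placeholders for
# illustration -- NOT the actual popular configs for this model.

def build_payload(prompt: str) -> dict:
    """Build an OpenAI-compatible request with common sampler settings."""
    return {
        "model": "Dampfinchen/Llama-3-8B-Ultra-Instruct",
        "messages": [{"role": "user", "content": prompt}],
        # Sampler settings (placeholder values):
        "temperature": 0.8,         # randomness of token selection
        "top_p": 0.95,              # nucleus sampling cutoff
        "top_k": 40,                # restrict sampling to top-k tokens
        "frequency_penalty": 0.0,   # penalize tokens by frequency so far
        "presence_penalty": 0.0,    # penalize tokens already present
        "repetition_penalty": 1.1,  # discourage verbatim repetition
        "min_p": 0.05,              # minimum relative probability floor
    }

payload = build_payload("Hello!")
```

Such a payload would be POSTed as JSON to the provider's `/chat/completions` route; which of these fields are honored depends on the serving backend.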