Muhammad2003/Llama3-8B-OpenHermes-DPO
Type: Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 8k · Published: Apr 18, 2024 · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1

Muhammad2003/Llama3-8B-OpenHermes-DPO is an 8 billion parameter language model, DPO-finetuned from Meta's Llama 3. It was optimized on the OpenHermes-2.5 preference dataset via QLoRA for improved instruction following and conversational ability, and is designed for general-purpose text generation and chat applications.


Model Overview

Muhammad2003/Llama3-8B-OpenHermes-DPO is an 8 billion parameter language model, fine-tuned from the Meta-Llama-3-8B base model. It distinguishes itself through Direct Preference Optimization (DPO) finetuning, performed on the OpenHermes-2.5 preference dataset using the QLoRA method. This optimization aligns the model's outputs more closely with human preferences, improving conversational quality and instruction-following ability.
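As general background (the standard DPO formulation, not details specific to this checkpoint): DPO trains the policy directly on preference pairs, without a separate reward model, by minimizing

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\, \pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
\left[
  \log \sigma\!\left(
    \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right)
\right]
```

where $y_w$ and $y_l$ are the preferred and dispreferred responses for prompt $x$, $\pi_{\mathrm{ref}}$ is the frozen reference model (here, presumably the Meta-Llama-3-8B base), and $\beta$ controls how far the tuned policy $\pi_\theta$ may drift from the reference.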

Key Capabilities

  • Enhanced Instruction Following: Benefits from DPO finetuning on a preference dataset, leading to more aligned and helpful responses.
  • Conversational AI: Optimized for chat-based interactions and generating coherent, contextually relevant dialogue.
  • General Text Generation: Capable of a wide range of text generation tasks, leveraging the strong foundation of the Llama 3 architecture.

Good For

  • Developing chatbots and conversational agents.
  • Applications requiring models that adhere well to user instructions.
  • General-purpose text generation where response quality and alignment are important.

Note: Evaluation results are currently pending and will be released soon.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
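To illustrate how these parameters are typically supplied, here is a minimal sketch of a completion request payload in the OpenAI-compatible style accepted by many hosted inference APIs. The parameter names mirror the knobs listed above; the values are placeholders, not the actual top configurations used by Featherless users, and `repetition_penalty`/`min_p` are extensions supported by some inference servers rather than part of the base OpenAI API.

```python
# Illustrative sampler settings; values are placeholders, not measured
# popular configs for this model.
sampler_settings = {
    "temperature": 0.7,         # randomness of token sampling
    "top_p": 0.9,               # nucleus sampling: keep smallest set of tokens with cumulative prob >= top_p
    "top_k": 40,                # restrict sampling to the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens proportionally to how often they already appeared
    "presence_penalty": 0.0,    # flat penalty on tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative penalty on repeated tokens (server extension)
    "min_p": 0.05,              # drop tokens below min_p * probability of the most likely token (server extension)
}

# Hypothetical completion request body combining the model ID with the sampler knobs.
payload = {
    "model": "Muhammad2003/Llama3-8B-OpenHermes-DPO",
    "prompt": "Write a haiku about direct preference optimization.",
    "max_tokens": 64,
    **sampler_settings,
}
```

Lower `temperature`/`top_p` values make output more deterministic; the penalty parameters trade repetition against fluency and usually need only small adjustments.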