nbeerbower/llama-3-gutenberg-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · License: llama3 · Architecture: Transformer

nbeerbower/llama-3-gutenberg-8B is an 8 billion parameter language model developed by nbeerbower on top of the Llama-3-8b base model. It was fine-tuned using Direct Preference Optimization (DPO) on the jondurbin/gutenberg-dpo-v0.1 dataset, which focuses on text quality and style. The model is optimized for generating high-quality, preference-aligned text, making it suitable for creative writing and content generation tasks where stylistic coherence matters.


Model Overview

nbeerbower/llama-3-gutenberg-8B is an 8 billion parameter language model derived from the Llama-3-8b base model. It was fine-tuned by nbeerbower using Direct Preference Optimization (DPO) on the jondurbin/gutenberg-dpo-v0.1 dataset, which is designed to improve text quality and alignment with human preferences. Fine-tuning was performed on an A100 GPU on Google Colab using LoRA (Low-Rank Adaptation) with rank r=16, lora_alpha=16, and lora_dropout=0.05.
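The LoRA hyperparameters above can be expressed as a peft-style keyword dictionary. Only `r`, `lora_alpha`, and `lora_dropout` come from the model card; `target_modules` and `task_type` are illustrative assumptions, shown here as a sketch rather than the author's actual training config:

```python
# LoRA settings from the model card, plus two assumed fields (marked below).
lora_config = {
    "r": 16,               # rank of the low-rank update matrices (from card)
    "lora_alpha": 16,      # scaling factor; effective scale = alpha / r = 1.0 (from card)
    "lora_dropout": 0.05,  # dropout on LoRA layers during training (from card)
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    "task_type": "CAUSAL_LM",                                    # assumption
}

# With the `peft` library installed, this maps onto its config class:
# from peft import LoraConfig
# config = LoraConfig(**lora_config)
```

Note that with `lora_alpha` equal to `r`, the adapter's update is applied at a scale of 1.0, a common choice that keeps the LoRA contribution on par with its learned magnitude.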

Key Capabilities

  • Preference-Aligned Text Generation: Optimized through DPO to produce outputs that align with human preferences, likely resulting in more coherent and stylistically consistent text.
  • Llama-3 Architecture: Benefits from the robust base architecture of Llama-3-8b, providing a strong foundation for language understanding and generation.
  • Fine-tuned for Quality: The use of the Gutenberg DPO dataset suggests an emphasis on refining text quality and stylistic nuances.
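Since the model builds on Llama-3-8b, prompts are plausibly formatted with the standard Llama-3 Instruct chat template. The model card does not state which template this fine-tune expects, so the helper below is an assumption-based sketch, not a documented interface:

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-3 Instruct chat format.

    Assumption: this fine-tune inherits the base Llama-3 Instruct template;
    the model card does not confirm this.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example: a creative-writing prompt playing to the model's DPO tuning.
prompt = build_llama3_prompt(
    "You are a novelist with a command of 19th-century prose style.",
    "Write the opening paragraph of a gothic short story.",
)
```

In practice, using the tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template` in transformers) is safer than hand-assembling the string, since it picks up whatever template ships with the model.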

Performance Metrics

Evaluations on the Open LLM Leaderboard show an Average score of 21.30%. Specific metric scores include:

  • IFEval (0-Shot): 43.72%
  • BBH (3-Shot): 27.96%
  • MMLU-PRO (5-shot): 31.45%

Good For

  • Creative Writing: Its DPO fine-tuning on a text quality dataset makes it suitable for generating creative content, stories, or stylistic prose.
  • Content Generation: Ideal for tasks requiring high-quality, preference-aligned text output.
  • Research and Experimentation: Developers can use this model to explore the effects of DPO fine-tuning on Llama-3 for specific text generation tasks.

Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model each tune the following samplers: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
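These sampler parameters map directly onto the fields of an OpenAI-compatible completions request. The sketch below shows a plausible request body; the values are placeholders chosen for illustration, not the actual Featherless user configurations, and the endpoint shape is an assumption:

```python
# Illustrative request body for an OpenAI-compatible completions endpoint.
# Parameter names mirror the sampler settings above; values are placeholders.
request_body = {
    "model": "nbeerbower/llama-3-gutenberg-8B",
    "prompt": "Once upon a midnight dreary,",
    "max_tokens": 256,
    "temperature": 0.8,        # higher = more varied word choice
    "top_p": 0.95,             # nucleus sampling: keep the smallest set of tokens covering 95% probability
    "top_k": 40,               # restrict sampling to the 40 most likely tokens
    "frequency_penalty": 0.0,  # penalize tokens in proportion to how often they appear
    "presence_penalty": 0.0,   # penalize any token that has already appeared at all
    "repetition_penalty": 1.1, # multiplicative penalty on repeated tokens
    "min_p": 0.05,             # drop tokens below 5% of the top token's probability
}
```

For creative writing, moderately high temperature with min_p filtering is a common pairing: temperature widens the distribution while min_p prunes its incoherent tail.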