ResplendentAI/Flora_DPO_7B

TEXT GENERATION · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 8K · Published: Mar 7, 2024 · License: cc-by-sa-4.0 · Architecture: Transformer

ResplendentAI/Flora_DPO_7B is a 7 billion parameter language model developed by ResplendentAI and fine-tuned with Direct Preference Optimization (DPO). The model achieves an average score of 74.26 on the Open LLM Leaderboard, with strong results across benchmarks including HellaSwag (88.28) and Winogrande (84.53). It is particularly suited to general language understanding and generation tasks where preference-aligned responses are beneficial.


Flora DPO: A 7B Parameter DPO-Tuned Model

ResplendentAI's Flora_DPO_7B is a 7 billion parameter language model fine-tuned with Direct Preference Optimization (DPO) on the mlabonne/chatml_dpo_pairs preference dataset. This tuning process aligns the model's outputs more closely with human preferences, improving its conversational and response generation quality.
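The card does not include ResplendentAI's training code, but the objective DPO optimizes is compact enough to sketch. The function below is a minimal, illustrative per-pair DPO loss in plain Python (the function name, `beta` value, and example log-probabilities are assumptions for illustration, not taken from this model's training run):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each argument is the summed log-probability a model assigns to the chosen
    or rejected response of one preference pair.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(x)) rewritten as log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# When the policy prefers the chosen response more strongly than the
# reference model does, the loss drops below -log(0.5) ≈ 0.693.
loss = dpo_loss(-10.0, -30.0, -12.0, -25.0)
```

Minimizing this loss pushes the policy to widen its chosen-vs-rejected margin relative to the frozen reference model, which is what "aligning outputs with human preferences" means concretely here.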

Key Capabilities & Performance

Evaluated on the Open LLM Leaderboard, Flora_DPO_7B demonstrates solid performance across a range of benchmarks, achieving an average score of 74.26.

  • AI2 Reasoning Challenge (25-shot): 71.76
  • HellaSwag (10-shot): 88.28
  • MMLU (5-shot): 64.13
  • TruthfulQA (0-shot): 71.08
  • Winogrande (5-shot): 84.53
  • GSM8k (5-shot): 65.81

These scores indicate proficiency in commonsense reasoning, language understanding, and question answering. The DPO fine-tuning makes the model particularly effective for applications that benefit from responses aligned with human preferences.
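The reported leaderboard average is simply the mean of the six benchmark scores listed above, which is easy to verify:

```python
# Per-task Open LLM Leaderboard scores from the model card.
scores = {
    "ARC (25-shot)": 71.76,
    "HellaSwag (10-shot)": 88.28,
    "MMLU (5-shot)": 64.13,
    "TruthfulQA (0-shot)": 71.08,
    "Winogrande (5-shot)": 84.53,
    "GSM8k (5-shot)": 65.81,
}

average = sum(scores.values()) / len(scores)  # 74.265, reported as 74.26
```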

Quantized Versions Available

For optimized deployment and reduced resource consumption, quantized versions of Flora_DPO_7B are available, including AWQ and EXL2 formats, provided by community contributors.
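A back-of-envelope calculation shows why these quantized variants matter for deployment. The sketch below uses a nominal 7B parameter count (actual Mistral-style 7B models are closer to 7.24B) and counts weight memory only, ignoring activations and the KV cache:

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Rough weight-only memory footprint in GB (decimal)."""
    return n_params * bits_per_weight / 8 / 1e9

n = 7_000_000_000          # nominal parameter count; an assumption, see lead-in
fp16 = weight_memory_gb(n, 16)   # ≈ 14.0 GB at half precision
awq4 = weight_memory_gb(n, 4)    # ≈ 3.5 GB with 4-bit AWQ-style quantization
```

Dropping from 16-bit to 4-bit weights cuts the footprint roughly fourfold, which is what moves a 7B model from datacenter GPUs into consumer-GPU range.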

Popular Sampler Settings

The three most popular sampler configurations among Featherless users for this model tune the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p