dhanushreddy29/BrokenKeyboard
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:10.7BQuant:FP8Ctx Length:4kPublished:Jan 12, 2024License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Warm

dhanushreddy29/BrokenKeyboard is a 10.7 billion parameter language model, finetuned from upstage/SOLAR-10.7B-Instruct-v1.0. This model was trained using Direct Preference Optimization (DPO) on the argilla/distilabel-intel-orca-dpo-pairs dataset, focusing on improving instruction-following capabilities. It achieves an average score of 74.08 on the Open LLM Leaderboard, demonstrating proficiency across various reasoning and language understanding tasks.

Loading preview...

Model Overview

dhanushreddy29/BrokenKeyboard is a 10.7 billion parameter language model developed by dhanushreddy29. It is a finetuned version of the upstage/SOLAR-10.7B-Instruct-v1.0 base model, specifically optimized using Direct Preference Optimization (DPO).

Training Details

The model was finetuned using the argilla/distilabel-intel-orca-dpo-pairs dataset. The training methodology followed a Google Colab guide for DPO, aiming to enhance the model's ability to follow instructions and generate preferred responses.

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, BrokenKeyboard achieved an average score of 74.08. Key performance metrics include:

  • AI2 Reasoning Challenge (25-Shot): 71.25
  • HellaSwag (10-Shot): 88.34
  • MMLU (5-Shot): 66.04
  • TruthfulQA (0-shot): 71.36
  • Winogrande (5-shot): 83.19
  • GSM8k (5-shot): 64.29

These scores indicate its capabilities in reasoning, common sense, language understanding, and mathematical problem-solving.

Use Cases

This model is suitable for tasks requiring robust instruction following and general language generation, particularly where performance on benchmarks like MMLU and HellaSwag is critical. Its DPO finetuning suggests improved alignment with human preferences for generated text.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p