Corianas/DPO-miniguanaco-1.5T
Text generation · Model size: 1.1B · Quant: BF16 · Context length: 2k · Published: Mar 6, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

Corianas/DPO-miniguanaco-1.5T is a 1.1 billion parameter language model developed by Corianas, fine-tuned with DPO from Corianas/tiny-llama-miniguanaco-1.5T, which was itself based on TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T. The model achieves an average score of 35.13 on the Open LLM Leaderboard, including 54.05 on HellaSwag and 42.69 on TruthfulQA. It is suited to general language understanding and generation tasks where a smaller, efficient model is preferred.


Overview

Corianas/DPO-miniguanaco-1.5T is a 1.1 billion parameter language model, representing a DPO (Direct Preference Optimization) fine-tuned version of the Corianas/tiny-llama-miniguanaco-1.5T model. The base model was originally derived from TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T, indicating a lineage focused on efficient, smaller-scale language processing.
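A minimal generation sketch using the Hugging Face transformers library (untested here; the dtype choice and the lack of a prompt template are assumptions — the model card does not document a prescribed prompt format):

```python
def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a completion with Corianas/DPO-miniguanaco-1.5T (sketch)."""
    # Imports deferred so the sketch can be read without the
    # transformers/torch dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Corianas/DPO-miniguanaco-1.5T"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # bfloat16 matches the listed BF16 weights; fall back to float32
    # on hardware without bfloat16 support.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

At 1.1B parameters with a 2k context, the model fits comfortably on a single consumer GPU or CPU.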

Key Capabilities

  • General Language Understanding: The model is capable of processing and generating human-like text.
  • Reasoning: Achieves 30.63 on the AI2 Reasoning Challenge (25-shot).
  • Common Sense: Scores 54.05 on HellaSwag (10-shot) and 58.64 on Winogrande (5-shot), indicating some common-sense reasoning ability.
  • Truthfulness: Demonstrates a TruthfulQA (0-shot) score of 42.69.

Performance Metrics

Evaluated on the Open LLM Leaderboard, Corianas/DPO-miniguanaco-1.5T achieved an average score of 35.13. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 30.63
  • HellaSwag (10-shot): 54.05
  • MMLU (5-shot): 24.79
  • TruthfulQA (0-shot): 42.69
  • Winogrande (5-shot): 58.64
  • GSM8k (5-shot): 0.00

Detailed results can be found on the Open LLM Leaderboard.
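The headline figure is (presumably) the unweighted mean of the six benchmark scores above, which can be verified in a few lines of Python:

```python
# Open LLM Leaderboard scores for Corianas/DPO-miniguanaco-1.5T,
# as listed above.
scores = {
    "ARC (25-shot)": 30.63,
    "HellaSwag (10-shot)": 54.05,
    "MMLU (5-shot)": 24.79,
    "TruthfulQA (0-shot)": 42.69,
    "Winogrande (5-shot)": 58.64,
    "GSM8k (5-shot)": 0.00,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # → 35.13
```

Note that the 0.00 on GSM8k pulls the average down noticeably; excluding math, the model's mean score across the other five benchmarks is higher.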

Good For

  • Applications requiring a compact and efficient language model.
  • Tasks involving general text generation and understanding where a 1.1B parameter model is sufficient (note the 0.00 GSM8k score: it is not suited to arithmetic or math word problems).
  • Exploration of DPO fine-tuning effects on smaller models.
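On that last point: DPO trains the policy directly on preference pairs by applying a logistic loss to the log-probability margin relative to a frozen reference model. A minimal, illustrative computation of the per-pair loss in plain Python (the function name and the β value are illustrative; this model's actual training hyperparameters are not documented here):

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for a single (chosen, rejected) preference pair.

    L = -log sigmoid(beta * [(log pi(y_w) - log pi_ref(y_w))
                             - (log pi(y_l) - log pi_ref(y_l))])
    """
    # Implicit rewards: how much the policy has shifted from the
    # reference model on each response.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with log1p.
    return math.log1p(math.exp(-margin))


# At initialization (policy == reference) the loss is log(2) ~= 0.693;
# it falls as the policy favors the chosen response over the rejected one.
print(dpo_loss(0.0, 0.0, 0.0, 0.0))
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

Minimizing this loss pushes the chosen response's log-probability up and the rejected one's down, with β controlling how far the policy may drift from the reference model.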