AdamLucek/Orpo-Llama-3.2-1B-15k
Text generation · 1B parameters · BF16 · 32k context length · Published: Oct 30, 2024 · License: MIT · Architecture: Transformer · Open weights

AdamLucek/Orpo-Llama-3.2-1B-15k is a 1 billion parameter language model, fine-tuned using the ORPO method on a subset of the mlabonne/orpo-dpo-mix-40k dataset. Based on Meta's Llama-3.2-1B architecture, this model is optimized for general reasoning and conversational tasks. It offers a balance of performance and efficiency, making it suitable for applications requiring a smaller, yet capable, language model.

Model Overview

AdamLucek/Orpo-Llama-3.2-1B-15k is a 1 billion parameter model derived from meta-llama/Llama-3.2-1B. It has been fine-tuned using the ORPO (Odds Ratio Preference Optimization) method, a technique designed to align language models with human preferences. The training utilized a 15,000-entry subset of the mlabonne/orpo-dpo-mix-40k dataset, specifically chosen for its quality and diversity.
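
The snippet below is a minimal sketch of loading the model for text generation with the Hugging Face transformers library; the prompt and sampling settings are illustrative, not part of the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AdamLucek/Orpo-Llama-3.2-1B-15k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

prompt = "The key idea behind preference optimization is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```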

Key Characteristics

  • ORPO Fine-tuning: Leverages the ORPO method for improved alignment and performance (a training sketch follows this list).
  • Efficient Training: Trained for 7 hours on an L4 GPU, demonstrating efficient resource utilization for fine-tuning.
  • Base Model: Built upon the robust meta-llama/Llama-3.2-1B architecture.
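
For context, the following is a hedged sketch of what an ORPO fine-tuning run over a 15,000-entry subset of mlabonne/orpo-dpo-mix-40k can look like with TRL's ORPOTrainer. The exact hyperparameters, shuffle seed, and data-selection strategy used for this checkpoint are not published; every value marked as assumed below is illustrative, not the author's recipe.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers typically ship no pad token
model = AutoModelForCausalLM.from_pretrained(base_id)

# 15,000-entry subset of the preference dataset named in the model card;
# the shuffle seed and selection strategy are assumptions.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
dataset = dataset.shuffle(seed=42).select(range(15_000))

config = ORPOConfig(
    output_dir="orpo-llama-3.2-1b-15k",
    beta=0.1,                       # weight of the ORPO odds-ratio loss term (assumed)
    learning_rate=8e-6,             # assumed
    per_device_train_batch_size=2,  # assumed; sized for a single L4 GPU
    gradient_accumulation_steps=4,  # assumed
    num_train_epochs=1,             # assumed
    max_length=1024,
    max_prompt_length=512,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,  # recent TRL accepts conversational chosen/rejected columns directly
    processing_class=tokenizer,  # named `tokenizer` in older TRL releases
)
trainer.train()
```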

Performance Benchmarks

Evaluated with EleutherAI's lm-evaluation-harness and compared against the companion AdamLucek/Orpo-Llama-3.2-1B-40k checkpoint, the model shows competitive performance across a range of tasks (a reproduction sketch follows the list):

  • AGIEval: Achieves 22.14% accuracy (0-Shot Average).
  • GPT4ALL: Scores 51.15% accuracy (0-Shot Average).
  • TruthfulQA: Demonstrates 42.79% MC2 accuracy.
  • MMLU: Reaches 31.22% accuracy (5-Shot Average).
  • Winogrande: Attains 61.72% accuracy (0-shot).
  • ARC Challenge: Shows 32.94% accuracy (0-shot).
  • PIQA: Achieves 75.46% accuracy (0-shot).
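
The zero-shot numbers above can in principle be reproduced with the harness's Python API. The sketch below assumes the standard task names from the lm-evaluation-harness registry (winogrande, arc_challenge, piqa, truthfulqa_mc2); the exact harness version and settings behind the reported scores are not stated in the card.

```python
import lm_eval

# The 0-shot tasks reported above; MMLU would need a separate 5-shot run.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=AdamLucek/Orpo-Llama-3.2-1B-15k,dtype=bfloat16",
    tasks=["winogrande", "arc_challenge", "piqa", "truthfulqa_mc2"],
    num_fewshot=0,
)
print(results["results"])
```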

Use Cases

This model is suitable for applications requiring a compact yet capable language model for tasks such as:

  • General text generation and completion.
  • Conversational AI and chatbots (see the sketch after this list).
  • Reasoning tasks where a smaller footprint is beneficial.
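
Below is a brief conversational sketch using the transformers pipeline API. Whether this checkpoint ships a chat template is an assumption here; if it does not, plain-text prompting as in the earlier example applies.

```python
from transformers import pipeline

# Model ID from the card; chat-template support is an assumption.
chat = pipeline("text-generation", model="AdamLucek/Orpo-Llama-3.2-1B-15k")

messages = [
    {"role": "user", "content": "Explain what ORPO fine-tuning does in two sentences."}
]
reply = chat(messages, max_new_tokens=120)
print(reply[0]["generated_text"][-1]["content"])
```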