AdamLucek/Orpo-Llama-3.2-1B-15k is a 1 billion parameter language model, fine-tuned using the ORPO method on a subset of the mlabonne/orpo-dpo-mix-40k dataset. Based on Meta's Llama-3.2-1B architecture, this model is optimized for general reasoning and conversational tasks. It offers a balance of performance and efficiency, making it suitable for applications requiring a smaller, yet capable, language model.
Model Overview
AdamLucek/Orpo-Llama-3.2-1B-15k is a 1 billion parameter model derived from meta-llama/Llama-3.2-1B. It has been fine-tuned using the ORPO (Odds Ratio Preference Optimization) method, a technique designed to align language models with human preferences. The training utilized a 15,000-entry subset of the mlabonne/orpo-dpo-mix-40k dataset, specifically chosen for its quality and diversity.
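The model can be loaded with the Hugging Face transformers library. The snippet below is a minimal usage sketch, not an official example from the card: the generation settings (max_new_tokens, temperature) are illustrative, and it assumes the fine-tuned tokenizer ships a chat template.

```python
# Minimal sketch: load Orpo-Llama-3.2-1B-15k and generate a reply.
# Generation parameters are illustrative, not values specified by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AdamLucek/Orpo-Llama-3.2-1B-15k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain ORPO fine-tuning in one paragraph."}]
# Assumes the tokenizer provides a chat template; otherwise encode a plain prompt string.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```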
Key Characteristics
- ORPO Fine-tuning: Leverages the ORPO method for improved alignment and performance (see the training sketch after this list).
- Efficient Training: Trained for 7 hours on an L4 GPU, demonstrating efficient resource utilization for fine-tuning.
- Base Model: Built upon the robust meta-llama/Llama-3.2-1B architecture.
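For reference, ORPO fine-tuning of this kind is commonly run with the TRL library's ORPOTrainer. The sketch below is an assumption about how such a run could be set up, not the training script used for this model: the beta value, learning rate, batch sizes, and the way the 15,000-row slice of mlabonne/orpo-dpo-mix-40k is selected are all illustrative.

```python
# Sketch of an ORPO fine-tuning run with TRL; hyperparameters are assumed,
# not the exact settings used to train Orpo-Llama-3.2-1B-15k.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Take a 15,000-example subset of the preference dataset (prompt/chosen/rejected pairs).
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(15_000))

config = ORPOConfig(
    output_dir="orpo-llama-3.2-1b-15k",
    beta=0.1,                         # weight of the odds-ratio preference term (assumed)
    learning_rate=5e-6,               # assumed
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,       # "tokenizer=" in older TRL versions
)
trainer.train()
```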
Performance Benchmarks
Evaluations with lm-evaluation-harness, compared against AdamLucek/Orpo-Llama-3.2-1B-40k, show competitive performance across a range of tasks (a reproduction sketch follows the list):
- AGIEval: Achieves 22.14% accuracy (0-shot average).
- GPT4All: Scores 51.15% accuracy (0-shot average).
- TruthfulQA: Demonstrates 42.79% MC2 accuracy.
- MMLU: Reaches 31.22% accuracy (5-shot average).
- Winogrande: Attains 61.72% accuracy (0-shot).
- ARC Challenge: Shows 32.94% accuracy (0-shot).
- PIQA: Achieves 75.46% accuracy (0-shot).
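Scores like these can be reproduced with the lm-evaluation-harness Python API. The invocation below is an assumed sketch: the task identifiers shown are the standard harness names for the individual tasks, the card does not list the exact task groups behind the AGIEval and GPT4All averages, and the 5-shot MMLU result would need a separate run with num_fewshot=5.

```python
# Sketch: scoring the model with lm-evaluation-harness (task list and settings assumed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=AdamLucek/Orpo-Llama-3.2-1B-15k,dtype=bfloat16",
    tasks=["winogrande", "arc_challenge", "piqa", "truthfulqa_mc2"],
    num_fewshot=0,
    batch_size=8,
)

# Print per-task metrics (accuracy, MC2, etc.).
for task, metrics in results["results"].items():
    print(task, metrics)
```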
Use Cases
This model is suitable for applications requiring a compact yet capable language model for tasks such as:
- General text generation and completion.
- Conversational AI and chatbots.
- Reasoning tasks where a smaller footprint is beneficial.