Qwen2.5-14B-Gutenberg-1e-Delta Overview
This model, developed by v000000, is a 14.8-billion-parameter variant of Qwen2.5-14B-Instruct. It was fine-tuned with Direct Preference Optimization (DPO) for 1.25 epochs on the jondurbin/gutenberg-dpo-v0.1 dataset, a set of preference pairs built around Project Gutenberg texts. This preference-based training is intended to improve the quality of its long-form writing.
Key Characteristics
- Base Model: Qwen2.5-14B-Instruct
- Parameter Count: 14.8 billion parameters
- Context Length: Supports an extensive context window of 131,072 tokens
- Training Method: DPO (Direct Preference Optimization) for 1.25 epochs on jondurbin/gutenberg-dpo-v0.1
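As a quickstart, the model can be loaded like any other Qwen2.5 chat checkpoint via Hugging Face transformers. This is a hedged sketch: the repo id `v000000/Qwen2.5-14B-Gutenberg-1e-Delta` is assumed from the card title, and the prompt formatting below hand-builds the ChatML layout that Qwen2.5 chat models use (in practice, `tokenizer.apply_chat_template` is the safer route).

```python
def build_chatml_prompt(messages):
    """Format a list of {'role', 'content'} dicts in the ChatML style
    used by Qwen2.5 chat models (assumed format, for illustration)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)


def run_demo():
    """Load the model and generate a reply. Not called automatically:
    this downloads ~30 GB of weights and needs a large GPU."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "v000000/Qwen2.5-14B-Gutenberg-1e-Delta"  # assumed repo id
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    prompt = build_chatml_prompt(
        [{"role": "user", "content": "Write the opening of a short story."}]
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt
    print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))


# run_demo()  # uncomment to run; requires GPU memory for a 14.8B model
```

Keeping the heavy loading inside `run_demo()` lets the formatting helper be exercised without downloading weights.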
Performance Metrics
Evaluations on the Open LLM Leaderboard indicate the following performance:
- Average Score: 32.11
- IFEval (0-Shot): 80.45
- BBH (3-Shot): 48.62
- MMLU-PRO (5-Shot): 43.67
Potential Use Cases
Given its DPO fine-tuning on a Gutenberg-derived dataset and large context window, this model could be particularly effective for:
- Long-form writing tasks, particularly the literary prose targeted by the Gutenberg training data.
- Applications benefiting from preference-aligned outputs.
- Scenarios where a large context is crucial for maintaining coherence and relevance over long interactions or documents.