Overview
Lyra4-Gutenberg2-12B: An ORPO Fine-tuned 12B Model
Lyra4-Gutenberg2-12B is a 12-billion-parameter language model developed by nbeerbower, built on the Sao10K/MN-12B-Lyra-v4 base. Compared to its predecessor, Lyra4-Gutenberg-12B, this iteration uses an increased sequence length, allowing it to process contexts of up to 32768 tokens.
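A minimal loading sketch with Hugging Face transformers is shown below. The repository id nbeerbower/Lyra4-Gutenberg2-12B is inferred from the model name above, and the bf16 precision and hardware notes are assumptions rather than documented requirements.

```python
# Minimal loading sketch; the repo id and bf16 precision are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/Lyra4-Gutenberg2-12B"  # assumed Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # roughly 24 GB of weights for 12B parameters in bf16
    device_map="auto",
)

# Long inputs (up to the 32768-token context described above) are tokenized
# and passed to generate() like any other prompt.
long_document = "..."  # placeholder for a long passage
inputs = tokenizer(long_document, return_tensors="pt").to(model.device)
print(inputs.input_ids.shape[-1], "tokens")
```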
Key Capabilities & Training:
- Base Model: Fine-tuned from Sao10K/MN-12B-Lyra-v4.
- Fine-tuning Method: ORPO (Odds Ratio Preference Optimization), trained for 3 epochs (see the sketch after this list).
- Training Data: A combination of the jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo datasets, formatted with ChatML.
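The sketch below outlines what an ORPO stage over these two datasets could look like using TRL's ORPOTrainer. It is not the author's exact training script: the dataset column names (prompt/chosen/rejected), the manual ChatML wrapping, and all hyperparameters other than the 3 epochs stated above are assumptions.

```python
# Hedged ORPO fine-tuning sketch with TRL; hyperparameters are illustrative.
from datasets import load_dataset, concatenate_datasets
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "Sao10K/MN-12B-Lyra-v4"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Combine the two Gutenberg preference datasets named above; the
# prompt/chosen/rejected column names are assumed from their DPO-style format.
columns = ["prompt", "chosen", "rejected"]
train_data = concatenate_datasets([
    load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train").select_columns(columns),
    load_dataset("nbeerbower/gutenberg2-dpo", split="train").select_columns(columns),
])

def to_chatml(example):
    # Wrap the prompt in ChatML markers; completions keep their plain text.
    prompt = (
        f"<|im_start|>user\n{example['prompt']}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return {
        "prompt": prompt,
        "chosen": example["chosen"] + "<|im_end|>",
        "rejected": example["rejected"] + "<|im_end|>",
    }

train_data = train_data.map(to_chatml)

config = ORPOConfig(
    output_dir="lyra4-gutenberg2-orpo",
    num_train_epochs=3,            # 3 epochs, as stated above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,            # assumed
    max_length=8192,               # assumed training sequence length
    max_prompt_length=4096,
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_data,
    processing_class=tokenizer,    # called `tokenizer` in older TRL releases
)
trainer.train()
```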
Performance Insights:
Evaluations on the Open LLM Leaderboard show an average score of 19.74. Specific metrics (see the reproduction sketch after this list) include:
- IFEval (0-Shot): 25.85
- BBH (3-Shot): 33.73
- MMLU-PRO (5-shot): 28.51
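A hedged way to reproduce these figures locally is EleutherAI's lm-evaluation-harness. The leaderboard_* task names and the batch size below are assumptions based on the Open LLM Leaderboard's published evaluation setup, not part of this model's documentation.

```python
# Reproduction sketch with lm-evaluation-harness; task names are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nbeerbower/Lyra4-Gutenberg2-12B,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_mmlu_pro"],
    batch_size=4,
)
print(results["results"])
```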
This model is suitable for applications that need a balance of parameter count and extended context handling, particularly where preference-based (ORPO) alignment is beneficial.
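For inference, a short usage sketch is given below. It assumes the tokenizer ships the ChatML chat template used during training; if it does not, the <|im_start|> / <|im_end|> markers can be written out by hand. The sampling settings are illustrative.

```python
# ChatML-style inference sketch; sampling parameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/Lyra4-Gutenberg2-12B"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Write a short scene in the style of a Victorian novel."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```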