AlphaMonarch-laser: A Performance-Optimized 7B Model
AlphaMonarch-laser is a 7-billion-parameter language model developed by abideen, built on the mlabonne/NeuralMonarch-7B base model. It was fine-tuned with DPO (Direct Preference Optimization) on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset. Its key innovation is the use of LaserQLoRA, which lets it outperform mlabonne/AlphaMonarch-7B despite fine-tuning only half of the projection modules.
Key Capabilities & Performance
- Leaderboard Topper: AlphaMonarch-laser holds the #1 rank on the Yet Another LLM Leaderboard (YALL), indicating strong overall capabilities.
- Robust Benchmarking: Evaluation results from Nous Benchmark and OpenLLM Benchmark show competitive performance across various tasks, including:
  - AGIEVAL: Average 28.41%
  - GPT4ALL: Average 76.98% (e.g., ARC-Challenge 66.30%, HellaSwag 69.60%)
  - TruthfulQA: Average 70.71% (mc1 63.04%, mc2 78.39%)
  - BIGBENCH: Average 55.37%
  - OpenLLM Benchmark: Average 73.5% (e.g., GSM8K 66.77%, Winogrande 84.6%)
Training Details
The model was trained for 1,080 steps with a learning rate of 5e-07, a per-device batch size of 1 with 8 gradient accumulation steps (an effective batch size of 8), and the Adam optimizer. Fine-tuning applied QLoRA to the q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj modules within the model's layers.
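The module targeting described above can be sketched as a PEFT `LoraConfig`. This is a minimal illustration, not the author's actual training script: the target module names come from this card, while the rank, alpha, and dropout values are hypothetical placeholders the card does not specify.

```python
# Hedged sketch of a QLoRA adapter config for the modules named in the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,               # hypothetical rank (not stated in the card)
    lora_alpha=32,      # hypothetical scaling factor (not stated in the card)
    lora_dropout=0.05,  # hypothetical dropout (not stated in the card)
    bias="none",
    task_type="CAUSAL_LM",
    # Projection modules targeted during fine-tuning, per the card:
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```

In a QLoRA setup, this config would be passed to `peft.get_peft_model` on a 4-bit-quantized base model, so only the low-rank adapters on the listed projections receive gradient updates.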