abideen/AlphaMonarch-daser

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 16, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

AlphaMonarch-daser is a 7 billion parameter language model developed by abideen, fine-tuned using a combination of the LaserQLoRA and DoRA techniques. It is a DPO fine-tuned version of mlabonne/NeuralMonarch-7B, trained on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset. It outperforms AlphaMonarch-dora on the YALL leaderboard despite being trained on only half of the projections. The model is optimized for general language tasks, with DPO fine-tuning providing enhanced conversational and instruction-following capabilities.


AlphaMonarch-daser Overview

AlphaMonarch-daser is a 7 billion parameter language model developed by abideen, built upon the foundation of mlabonne/NeuralMonarch-7B. It combines two fine-tuning techniques: LaserQLoRA and DoRA.

Key Characteristics & Training

  • Fine-tuning Method: The model was fine-tuned using Direct Preference Optimization (DPO) on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset.
  • Efficiency: Notably, AlphaMonarch-daser achieved better results than its predecessor, AlphaMonarch-dora, despite being fine-tuned on only half of the projections.
  • Training Steps: The model was trained for 1080 steps with a learning rate of 5e-07 and a cosine learning rate scheduler.
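The cosine schedule mentioned above can be sketched in plain Python. The peak learning rate (5e-07) and step count (1080) come from the model card; the warmup-free cosine decay formula and the helper name are illustrative assumptions, not the exact trainer code:

```python
import math

PEAK_LR = 5e-07      # from the model card
TOTAL_STEPS = 1080   # from the model card

def cosine_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Standard cosine decay from peak_lr toward 0 (no warmup assumed)."""
    progress = min(step, total_steps) / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# The rate starts at the peak, halves at the midpoint, and decays to ~0 at the end.
print(cosine_lr(0))     # 5e-07
print(cosine_lr(540))   # ~2.5e-07
print(cosine_lr(1080))  # ~0.0
```

This is the usual shape produced by a cosine scheduler such as the one used in Hugging Face trainers when warmup is disabled.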

Performance & Evaluation

AlphaMonarch-daser's performance has been evaluated on prominent leaderboards:

  • YALL Leaderboard: It ranks above AlphaMonarch-dora, AlphaMonarch, and AlphaMonarch-laser.
  • OpenLLM Bench: On this benchmark, it performs competitively, ranking above AlphaMonarch-dora but below AlphaMonarch-laser and AlphaMonarch.

This model is suitable for general language generation and instruction-following tasks, benefiting from its DPO fine-tuning on a preference dataset.
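For conversational use, prompts need to follow the chat format of the base model. Assuming AlphaMonarch-daser inherits the Mistral-Instruct style template from its NeuralMonarch-7B base (an assumption; check the model's tokenizer config), a minimal prompt builder looks like this, with the function name being illustrative:

```python
def format_mistral_instruct(messages):
    """Build a Mistral-Instruct style prompt: [INST] user [/INST] answer</s> pairs.
    NOTE: assumed template inherited from the NeuralMonarch-7B base; verify
    against the model's own tokenizer chat template before relying on it."""
    prompt = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            prompt += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            prompt += f" {msg['content']}</s>"
    return prompt

prompt = format_mistral_instruct([
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
# prompt == "<s>[INST] Summarize DPO in one sentence. [/INST]"
```

In practice you would pass the formatted prompt to the model via `transformers` (or apply `tokenizer.apply_chat_template`, which performs this formatting for you when the template is defined).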