Coven 7B 128K ORPO Alpha: Enhanced Mistral-7B for Extended Context and Preference Optimization
Coven 7B 128K ORPO Alpha is a 7-billion-parameter language model developed by raidhon, building on the Mistral-7B-Instruct-v0.2 base. It enhances the base model in two main ways: an expanded context window (128K tokens via YaRN) and preference alignment through ORPO fine-tuning.
Key Capabilities & Differentiators
- Extended Context Window: Uses the YaRN RoPE-scaling technique to reach a 128K-token context length, allowing the model to process and understand much larger and more complex documents or conversations (a sketch of the underlying frequency scaling follows this list).
- ORPO Fine-tuning: Trained with ORPO (Odds Ratio Preference Optimization), a monolithic preference-alignment method that directly optimizes the odds ratio between chosen and rejected responses, improving alignment without a separate reference model (see the loss sketch after this list).
- Improved Performance: Demonstrates notable gains across several benchmarks compared to its base model, Mistral-7B-Instruct-v0.2:
  - GSM8K (strict / flexible): +73.65% / +73.29% in exact-match accuracy, indicating stronger mathematical reasoning.
  - MMLU: +7.16% in accuracy, suggesting better general knowledge and understanding.
  - Winogrande, PIQA, BoolQ, ARC Easy, ARC Challenge: consistent accuracy gains, reflecting improved commonsense reasoning and question answering.
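To make the context-extension point concrete, below is a minimal sketch of the "NTK-by-parts" frequency blending at the heart of YaRN: fast-rotating RoPE dimensions keep their original frequencies, slow-rotating ones are interpolated, and a ramp blends the two. The hyperparameters (`base`, `orig_ctx`, `beta_fast`, `beta_slow`, and the 4x `scale` implied by going from a 32K window to 128K) are illustrative assumptions, not values read from this model's config; the full YaRN method also rescales attention logits, which is omitted here.

```python
import math
import numpy as np

def yarn_inv_freq(head_dim=128, base=1e6, scale=4.0,
                  orig_ctx=32768, beta_fast=32.0, beta_slow=1.0):
    """Blend extrapolated and interpolated RoPE inverse frequencies
    (YaRN's NTK-by-parts scheme; hyperparameters are assumptions)."""
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    wavelen = 2 * math.pi / inv_freq      # tokens per full rotation
    rotations = orig_ctx / wavelen        # rotations over the original window
    # ramp -> 1: fast-rotating dims keep their original frequency;
    # ramp -> 0: slow-rotating dims are fully interpolated (freq / scale).
    ramp = np.clip((rotations - beta_slow) / (beta_fast - beta_slow), 0.0, 1.0)
    return inv_freq * ramp + (inv_freq / scale) * (1.0 - ramp)
```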
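To illustrate why ORPO needs no reference model, here is a hedged sketch of its objective as described in the ORPO paper: the ordinary supervised NLL on the chosen response plus a term that pushes up the log odds ratio of chosen over rejected responses, all computed under the single policy being trained. The weighting `lam` and the batching conventions are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def orpo_loss(avg_logp_chosen, avg_logp_rejected, nll_chosen, lam=0.1):
    """ORPO objective: L_SFT + lam * L_OR (no frozen reference model).

    avg_logp_*: average per-token log-probabilities of the chosen and
    rejected responses under the policy being trained (both < 0).
    nll_chosen: standard cross-entropy loss on the chosen response.
    """
    # log odds(y|x) = log p - log(1 - p), with p = exp(avg_logp)
    log_odds = (avg_logp_chosen - avg_logp_rejected) - (
        torch.log1p(-torch.exp(avg_logp_chosen))
        - torch.log1p(-torch.exp(avg_logp_rejected))
    )
    l_or = -F.logsigmoid(log_odds).mean()  # favor the chosen response
    return nll_chosen + lam * l_or
```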
Ideal Use Cases
- Applications requiring processing and understanding of very long texts, such as legal documents, extensive reports, or prolonged conversational histories.
- Tasks benefiting from improved mathematical reasoning and general knowledge, as evidenced by its benchmark performance.
- Deployments that want preference-aligned, long-context capability at the efficient 7B-parameter scale.
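For these long-document use cases, loading the model follows the standard transformers pattern. A minimal sketch is below; the repository id and the `contract.txt` input are assumptions for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "raidhon/coven_7b_128k_orpo_alpha"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# The 128K window lets an entire long document fit in one prompt.
document = open("contract.txt").read()  # hypothetical long input
messages = [{"role": "user",
             "content": "Summarize the key obligations in this contract:\n" + document}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```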