Model Overview
invalid-coder/dolphin-2.1-mistral-7b-snr-laser is a 7-billion-parameter language model built on the Mistral architecture. It uses laserRMT, a training technique that partially freezes the model's weights to mitigate catastrophic forgetting, the tendency to lose previously learned knowledge when fine-tuning on new, specific skills such as function calling. The model is uncensored and trained for high compliance with user requests, making it adaptable to a wide range of applications.
Key Capabilities & Features
- Catastrophic Forgetting Prevention: Employs laserRMT to retain previously learned knowledge while acquiring new skills.
- Uncensored & Compliant: Trained on a dataset filtered to remove alignment and bias, resulting in a highly compliant model. Users are advised to implement their own alignment layer before deploying it as a service.
- Commercial Use: Licensed under Apache-2.0, allowing for both commercial and non-commercial applications.
- ChatML Prompt Format: Utilizes the ChatML format for consistent and effective interaction.
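The ChatML format wraps each turn in `<|im_start|>` / `<|im_end|>` tags with a role name. A minimal sketch of a prompt builder (the helper name and example messages are hypothetical, not part of the model's tooling):

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave the assistant turn open so the model generates the completion.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Summarize what laserRMT does."},
])
print(prompt)
```

The resulting string would be passed to the tokenizer as the raw prompt; libraries such as transformers can also apply this template automatically if the tokenizer ships a chat template.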
Training & Dataset
The model was trained for 4 epochs over 48 hours on 4x A100 GPUs. Its dataset is a modified version of Dolphin, an open-source implementation of Microsoft's Orca, augmented with Jon Durbin's Airoboros dataset to boost creativity.
Performance Highlights
Evaluations on the Open LLM Leaderboard show competitive performance for its size:
- Avg.: 53.47
- MMLU (5-shot): 63.32
- HellaSwag (10-shot): 84.92
This model is well-suited for developers seeking a flexible, uncensored LLM with enhanced skill retention capabilities, particularly for tasks requiring precise instruction following.
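Because the model is uncensored, the card's advice to add your own alignment layer before serving it matters in practice. A minimal post-generation filter might look like the sketch below; the blocklist, refusal message, and function name are all hypothetical, and a real deployment would typically use a dedicated moderation model or API rather than keyword matching:

```python
# Hypothetical policy: terms and refusal text are illustrative only.
BLOCKED_TERMS = {"build a weapon", "credit card numbers"}
REFUSAL = "I can't help with that request."

def align_output(response: str) -> str:
    """Return the model's response, or a refusal if it violates the policy."""
    lowered = response.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return REFUSAL
    return response

print(align_output("Here is a pancake recipe."))
print(align_output("Sure, here are some credit card numbers..."))
```

Such a filter sits between the model's raw output and the user-facing service, keeping the base model fully compliant while the deployment enforces its own policy.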