VAGOsolutions FC-SauerkrautLM-7b-beta Overview
FC-SauerkrautLM-7b-beta is a 7-billion-parameter function-calling model, developed collaboratively by VAGO solutions and Hyperspace.ai. It is built on openchat/openchat-3.5-0106 and enhanced with a specific focus on function-calling capabilities.
Key Capabilities and Training Innovations
- Function Calling: The model is trained on a dedicated function-calling dataset, making it adept at interpreting user requests and generating appropriate tool calls. Multiple branches are provided for different configurations (e.g., Laser 3, 4, 8, and 16 layers) so users can choose the variant that best balances function-calling performance.
- Novel Training Technique (LaserRMT): The model uses a training strategy called LaserRMT, which partially freezes the model based on a laser-like analysis of its layers. Motivated by the no free lunch theorem, updating only selected layers aims to prevent catastrophic forgetting and to improve specific skills, such as mathematical ability, without degrading overall performance.
- Multilingual Support: The model supports both German and English, with improved German language skills.
- Alignment: Fine-tuned using Supervised Fine-Tuning (SFT) and aligned with Direct Preference Optimization (DPO).
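The partial-freezing idea behind LaserRMT can be illustrated with a short sketch. The exact procedure and layer selection used by the authors are not published in this overview, so the helper `freeze_all_but`, the toy 8-layer stack, and the choice of which layers stay trainable are all hypothetical, purely for illustration:

```python
import torch.nn as nn

def freeze_all_but(model: nn.Module, trainable_layers: set) -> None:
    """Freeze every parameter except those in the selected layer indices.

    Assumes parameter names of the form "layers.<idx>.<param>", as produced
    by an nn.ModuleList attribute named "layers" (an assumption of this sketch).
    """
    for name, param in model.named_parameters():
        parts = name.split(".")
        idx = int(parts[1]) if parts[0] == "layers" and parts[1].isdigit() else None
        param.requires_grad = idx in trainable_layers

# Toy stand-in for a transformer stack: 8 "layers" of linear blocks.
model = nn.Module()
model.layers = nn.ModuleList(nn.Linear(16, 16) for _ in range(8))

# Keep only the last three layers trainable (illustrative choice, not the
# actual LaserRMT selection criterion).
freeze_all_but(model, {5, 6, 7})
```

Freezing parameters this way excludes them from gradient updates, which is one common mechanism for limiting catastrophic forgetting during fine-tuning.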
Performance and Use Cases
While function calling is its primary focus, the model also shows competitive general language understanding. The Laser 3-layer variant achieved the best benchmark results among the provided branches, with an average score of 66.82 across ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K. The model suits developers who want to integrate robust function-calling capabilities into their applications, particularly in German- and English-language contexts, as well as those interested in exploring models trained with novel optimization techniques.
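Integrating a function-calling model typically means parsing the model's emitted tool call and dispatching it to real code. The snippet below is a minimal sketch of that loop; the JSON shape (`"name"`/`"arguments"`), the `TOOLS` registry, and the `get_weather` tool are assumptions for illustration, not the documented output format of FC-SauerkrautLM-7b-beta:

```python
import json

# Hypothetical tool registry; in a real application each entry would call an API.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",  # stand-in implementation
}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call (assumed format) and invoke the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated model output; a deployed system would take this from generation.
reply = dispatch('{"name": "get_weather", "arguments": {"city": "Berlin"}}')
print(reply)  # -> Sunny in Berlin
```

In practice the dispatch step would also validate the function name and arguments before execution, since model output is untrusted input.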