Model Overview
cstr/Spaetzle-v69-7b is a 7-billion-parameter language model developed by cstr through a progressive merge process, primarily using the 'dare-ties' and 'slerp' merge methods. The model aims to strike a robust compromise between English and German proficiency, with a focus on instruction following and reasoning. It is built on a diverse set of base models, including abideen/AlphaMonarch-dora, mayflowergmbh/Wiedervereinigung-7b-dpo, and occiglot/occiglot-7b-de-en-instruct, among others, to achieve its bilingual performance.
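To make the merge methods concrete, the snippet below sketches spherical linear interpolation ('slerp') between two weight tensors. This is a simplified, self-contained illustration, not mergekit's actual implementation; the function name, signature, and the flatten-and-normalize treatment of the tensors are assumptions made for the example.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (simplified sketch)."""
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the two tensors, treated as vectors on the unit sphere.
    cos_omega = torch.dot(a_flat / (a_flat.norm() + eps), b_flat / (b_flat.norm() + eps))
    omega = torch.arccos(torch.clamp(cos_omega, -1.0, 1.0))
    if omega < eps:
        # Nearly parallel tensors: plain linear interpolation is numerically safer.
        return (1.0 - t) * a + t * b
    sin_omega = torch.sin(omega)
    mixed = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
          + (torch.sin(t * omega) / sin_omega) * b_flat
    return mixed.reshape(a.shape).to(a.dtype)

# Example: blend two layers halfway along the arc between them.
layer_a = torch.randn(1024, 1024)
layer_b = torch.randn(1024, 1024)
merged = slerp(0.5, layer_a, layer_b)
```

In practice, merge tooling applies such interpolation per parameter tensor, often with layer-dependent interpolation factors; 'dare-ties', by contrast, drops a fraction of each fine-tune's delta weights and rescales the remainder before combining them.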
Key Capabilities & Performance
The model's benchmark results reflect its design as a compromise between criteria such as German language performance, instruction following, and reasoning:
- EQ Bench: 62.59 on the German set (v2_de) and 76.43 on the English set (v2).
- Open LLM Leaderboard: 72.87 average, including 86.77 on HellaSwag (10-shot) and 68.76 on GSM8k (5-shot).
- Nous benchmark: 58.27% average, including 75.84% on GPT4All and 66.15% on TruthfulQA.
Use Cases
This model is particularly well-suited for applications requiring strong bilingual capabilities in English and German. Its balanced performance across reasoning and language-understanding tasks makes it a versatile choice for:
- Bilingual applications: content generation or understanding in both English and German (see the usage sketch after this list).
- Instruction following: tasks that require precise adherence to given instructions.
- Reasoning tasks: benchmark scores indicate solid performance in logical and common-sense reasoning.
- Local tasks: scenarios where a compromise between the different performance criteria is acceptable and desired.
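For the bilingual and instruction-following use cases above, a minimal generation sketch with the Hugging Face transformers library might look as follows. The dtype and device settings are illustrative choices, and the snippet assumes the tokenizer ships a chat template; if it does not, format the prompt as plain text instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cstr/Spaetzle-v69-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; choose a dtype your hardware supports
    device_map="auto",
)

# A German instruction ("Explain in two sentences what a language model is.");
# the same pattern works for English prompts.
messages = [{"role": "user", "content": "Erkläre in zwei Sätzen, was ein Sprachmodell ist."}]

# Assumes the tokenizer provides a chat template; otherwise build the prompt by hand.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```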