cstr/Spaetzle-v69-7b

  • Task: Text generation
  • Model size: 7B
  • Quantization: FP8
  • Context length: 8k
  • License: cc-by-nc-4.0
  • Architecture: Transformer
  • Concurrency cost: 1
  • Open weights

cstr/Spaetzle-v69-7b is a 7 billion parameter language model created by cstr through a progressive merge of multiple models. It is designed to offer balanced performance on both English and German language tasks, with capabilities in instruction following and reasoning. The model achieves competitive benchmark scores, making it suitable for local tasks that require strong bilingual understanding.


Model Overview

cstr/Spaetzle-v69-7b is a 7 billion parameter language model developed by cstr through a progressive merge process, primarily using 'dare-ties' and 'slerp' methods. This model aims to provide a robust compromise for tasks requiring proficiency in both English and German, focusing on instruction following and reasoning capabilities. It is built upon a diverse set of base models, including abideen/AlphaMonarch-dora, mayflowergmbh/Wiedervereinigung-7b-dpo, and occiglot/occiglot-7b-de-en-instruct, among others, to achieve its bilingual performance.
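The merge methods named above, 'dare-ties' and 'slerp', are commonly applied per weight tensor (typically via a tool such as mergekit; the exact merge recipe is not reproduced here). As an illustration of the slerp idea, the sketch below interpolates two flattened weight vectors along the arc between their directions rather than along a straight line; this is a minimal pure-Python sketch, not the actual merge code:

```python
import math

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t=0 returns a, t=1 returns b; intermediate t follows the great-circle
    arc between the two directions. Falls back to plain linear
    interpolation when the vectors are nearly parallel.
    """
    norm_a = math.sqrt(sum(x * x for x in a)) + eps
    norm_b = math.sqrt(sum(x * x for x in b)) + eps
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    dot = max(-1.0, min(1.0, dot))       # guard against rounding error
    omega = math.acos(dot)               # angle between the two directions
    if omega < eps:                      # nearly parallel: lerp is stable
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    sin_omega = math.sin(omega)
    ca = math.sin((1 - t) * omega) / sin_omega
    cb = math.sin(t * omega) / sin_omega
    return [ca * x + cb * y for x, y in zip(a, b)]
```

Unlike linear averaging, slerp preserves the magnitude character of the blended weights, which is one reason it is popular for model merging.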

Key Capabilities & Performance

The model delivers balanced performance across benchmarks, reflecting its design as a compromise among German-language quality, instruction following, and reasoning. It scores 62.59 on German EQ Bench (v2_de) and 76.43 on English EQ Bench (v2). On the Open LLM Leaderboard it averages 72.87, including 86.77 on HellaSwag (10-shot) and 68.76 on GSM8k (5-shot). In the Nous benchmark suite it averages 58.27%, with 75.84% on GPT4All and 66.15% on TruthfulQA.

Use Cases

This model is particularly well-suited for applications requiring strong bilingual capabilities in English and German. Its balanced performance across various reasoning and language understanding tasks makes it a versatile choice for:

  • Bilingual applications: Where content generation or understanding is needed in both English and German.
  • Instruction following: For tasks that require precise adherence to given instructions.
  • Reasoning tasks: Its benchmark scores indicate solid performance in logical and common-sense reasoning.
  • Local deployment: At 7B parameters, the model is small enough to run on local hardware in scenarios where its trade-off between performance criteria is acceptable.
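The use cases above can be tried locally with Hugging Face transformers. The sketch below is a minimal example, not an official quickstart: the model id comes from this page, but the ChatML-style prompt template is an assumption and should be verified against the model's tokenizer configuration, and running `generate` requires downloading the weights (with transformers and accelerate installed).

```python
MODEL_ID = "cstr/Spaetzle-v69-7b"

def build_chatml_prompt(user_message: str,
                        system: str = "You are a helpful bilingual assistant.") -> str:
    """Format a single-turn prompt in ChatML style.

    Assumed template: confirm against the model's tokenizer chat
    template before relying on it.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run a single completion locally (downloads ~7B weights)."""
    # Heavy imports kept local so the prompt helper above stays
    # importable without the transformers dependency installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    # German instruction-following example
    print(generate(build_chatml_prompt(
        "Erkläre kurz, was ein Transformer-Modell ist.")))
```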

Popular Sampler Settings

The most popular parameter combinations among Featherless users for this model tune the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
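These sampler parameters can be bundled into the payload of an OpenAI-compatible completion request. The sketch below uses illustrative placeholder defaults, not the actual popular configurations referenced above, and the `sampler_settings` helper is hypothetical:

```python
def sampler_settings(
    temperature: float = 0.7,       # illustrative defaults only,
    top_p: float = 0.9,             # not the Featherless user configs
    top_k: int = 40,
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
    repetition_penalty: float = 1.1,
    min_p: float = 0.05,
) -> dict:
    """Bundle sampler parameters into a request payload fragment
    for an OpenAI-compatible completion API."""
    if temperature < 0.0:
        raise ValueError("temperature must be non-negative")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    return {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    }
```

The resulting dict can be merged into the JSON body of a chat or completion request alongside the model name and prompt.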