Spaetzle-v8-7b: A Merged Bilingual Model
cstr/Spaetzle-v8-7b is a 7-billion-parameter language model created by merging several existing models, including flemmingmiguel/NeuDist-Ro-7B, johannhartmann/Brezn3, and ResplendentAI/Flora_DPO_7B, on top of the base model mayflowergmbh/Wiedervereinigung-7b-dpo-laser. The model aims for solid performance in both German and English, with a particular focus on reliable instruction following and reasoning.
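The exact merge recipe is not reproduced here. Purely as an illustrative sketch, a mergekit configuration for combining these models over the stated base might look like the following (the merge method, density, and weight values are hypothetical, not the actual recipe used for Spaetzle-v8-7b):

```yaml
# Hypothetical mergekit config -- method and all numeric values are
# illustrative, not the actual Spaetzle-v8-7b recipe.
models:
  - model: flemmingmiguel/NeuDist-Ro-7B
    parameters:
      density: 0.60
      weight: 0.30
  - model: johannhartmann/Brezn3
    parameters:
      density: 0.65
      weight: 0.40
  - model: ResplendentAI/Flora_DPO_7B
    parameters:
      density: 0.60
      weight: 0.30
merge_method: dare_ties
base_model: mayflowergmbh/Wiedervereinigung-7b-dpo-laser
dtype: bfloat16
```

A config in this shape would be run with mergekit's `mergekit-yaml` command to produce the merged checkpoint.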
Key Characteristics & Performance
- Bilingual Focus: Designed to perform well on both German and English tasks.
- Behavioral Consistency: Prioritizes consistent output and avoids issues like rambling or template intermixing.
- Instruction Following: Shows a preference for instruction following and reasoning over strict German grammatical perfection.
- Evaluation Scores: Averages 72.27 on the Open LLM Leaderboard, including 68.69 on the AI2 Reasoning Challenge (ARC) and 64.60 on MMLU. On EQ-Bench it scores 61.04 (v2_de) and 78.3 (v2_english).
- Context Length: Supports a context length of 4096 tokens.
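Because the context window is fixed at 4096 tokens, callers need to budget prompt tokens against the space reserved for generation. A minimal sketch of such a guard (the function name and token counts are illustrative, not part of the model's API):

```python
MAX_CONTEXT = 4096  # Spaetzle-v8-7b's context window, per the model card

def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context: int = MAX_CONTEXT) -> bool:
    # The prompt plus the requested completion must fit in the window;
    # otherwise the prompt should be truncated or max_new_tokens reduced.
    return prompt_tokens + max_new_tokens <= context

# 3000 prompt tokens + 512 new tokens = 3512, which fits in 4096.
print(fits_context(3000, 512))
# 3900 + 512 = 4412 exceeds the window.
print(fits_context(3900, 512))
```

In practice `prompt_tokens` would come from the model's own tokenizer (e.g. `len(tokenizer(prompt)["input_ids"])` with a Hugging Face tokenizer), since token counts differ between tokenizers.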
Use Cases & Considerations
- Good for: Scenarios where robust instruction following and reasoning are critical, and minor imperfections in German grammar or orthography are acceptable.
- Not ideal for: Applications requiring highly precise and grammatically perfect German text generation, where models like DiscoLM might be stronger.
- Configuration: Uses the ChatML prompt format, so it is compatible with standard chat interfaces and chat templates.
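As a sketch of what ChatML formatting means concretely, the helper below assembles a prompt in the standard ChatML layout (`<|im_start|>role`, content, `<|im_end|>`), ending with an open assistant turn for the model to complete. The helper function and the example messages are illustrative; in practice you would typically rely on the tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template` in transformers) rather than hand-rolling the string:

```python
def build_chatml_prompt(messages):
    # Render each message as: <|im_start|>role\ncontent<|im_end|>
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave an open assistant turn for the model to continue.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Erkläre kurz, was ein Sprachmodell ist."},
])
print(prompt)
```

The resulting string is what a ChatML-compatible chat interface sends to the model; generation stops when the model emits `<|im_end|>`.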