aari1995/germeo-7b-laser

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jan 9, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

aari1995/germeo-7b-laser is a 7-billion-parameter causal decoder-only transformer language model developed by aari1995, created by merging leo-mistral-hessianai-7b-chat and DPOpenHermes-7B-v2. The model is tuned to reply in German while retaining strong English understanding, leveraging "laser" data for improved language comprehension. It excels at German language tasks while maintaining competitive English benchmark performance, making it suitable for applications that require robust German generation alongside English comprehension.


Germeo-7B-Laser: German-Focused Multilingual Model

The germeo-7b-laser is a 7 billion parameter language model developed by aari1995, built by merging leo-mistral-hessianai-7b-chat and DPOpenHermes-7B-v2. Its core innovation lies in the integration of "laser" data (specifically LeoLM/OpenSchnabeltier), which aims to enhance language understanding while prioritizing German replies.

Key Capabilities & Features

  • German-Centric Generation: The model is hypothesized to increase the probability of German replies and boost internal German capabilities, making it highly suitable for German language generation tasks.
  • English Understanding: Despite its German output focus, the model retains strong English understanding capabilities, as indicated by its competitive performance on English benchmarks.
  • Merged Architecture: Combines strengths from established models like leo-mistral-hessianai-7b-chat and DPOpenHermes-7B-v2 to achieve its specialized multilingual profile.
  • Prompt Format: Uses a system/user/assistant prompt template for optimal interaction, with custom stopping criteria provided so that generation returns only the assistant's reply.
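The system/user/assistant template and reply-only stopping behavior described above can be sketched as follows. Note this is a minimal illustration, not the model's confirmed format: the ChatML-style role markers (`<|im_start|>`, `<|im_end|>`) are an assumption based on the merged parent models, and both helper functions are hypothetical.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt ending at the assistant turn.

    The role-marker tokens are assumed, not taken from this card.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


def extract_reply(generated: str, stop_token: str = "<|im_end|>") -> str:
    """Mimic reply-only stopping criteria: truncate at the stop token."""
    return generated.split(stop_token, 1)[0].strip()


# Example: English instruction, German system prompt steering the output language.
prompt = build_prompt(
    "Du bist ein hilfreicher Assistent und antwortest immer auf Deutsch.",
    "Explain what a transformer model is, briefly.",
)
```

The resulting string would then be passed to the tokenizer and model; the trailing `<|im_start|>assistant\n` leaves the model positioned to generate its reply directly.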

Performance Highlights

Specific German-task benchmark results for germeo-7b-laser (MMLU-DE, Hellaswag-DE, ARC-DE) are marked as "?" in the README, so improved German language processing is a design goal rather than a measured result. On English benchmarks the model performs strongly, with an average score of 67.9 across MMLU, Hellaswag, and ARC, and an overall Open LLM Leaderboard average of 62.82.

Good For

  • Applications requiring robust German text generation.
  • Use cases where English input needs to be understood, but the output should primarily be in German.
  • Developers experimenting with multilingual models focused on specific language output while maintaining broad understanding.