VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jul 25, 2024 · License: llama3.1 · Architecture: Transformer

VAGO solutions' Llama-3.1-SauerkrautLM-8b-Instruct is an 8 billion parameter instruction-tuned model, fine-tuned from Meta-Llama-3.1-8B-Instruct. It specializes in German and English language tasks, leveraging a unique German-English Sauerkraut Mix v2 dataset and Spectrum Fine-Tuning targeting 25% of layers. This model demonstrates resource-efficient enhancement of capabilities in both languages while preserving existing knowledge.


VAGO solutions Llama-3.1-SauerkrautLM-8b-Instruct Overview

VAGO solutions presents Llama-3.1-SauerkrautLM-8b-Instruct, an 8 billion parameter instruction-tuned model derived from Meta's Llama-3.1-8B-Instruct. This model showcases the effectiveness of Spectrum Fine-Tuning, a resource-efficient method that targets only 25% of the model's layers to enhance specific capabilities.
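
The card does not spell out Spectrum's layer-selection criterion, but the core idea of training only a subset of transformer blocks is easy to illustrate. The sketch below is a simplified, assumption-laden stand-in: Spectrum itself selects layers via a signal-to-noise analysis, whereas here an evenly spaced 25% of the decoder blocks is simply unfrozen on the Meta base model.

```python
# Minimal sketch of layer-targeted fine-tuning in the spirit of Spectrum.
# Assumptions: Hugging Face transformers, a Llama-style model whose decoder
# blocks live at model.model.layers. Spectrum picks layers by signal-to-noise
# analysis; for illustration we unfreeze an evenly spaced 25% instead.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

layers = model.model.layers  # 32 decoder blocks for an 8B Llama
step = 4                     # unfreeze every 4th block -> 25% of the layers
for i in range(0, len(layers), step):
    for param in layers[i].parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable / total:.1%} of {total:,}")
```

Only the unfrozen blocks accumulate gradients during training, which is what makes the approach cheaper in memory and compute than full fine-tuning.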

Key Characteristics & Training

  • Bilingual Focus: Fine-tuned specifically on a proprietary "German-English Sauerkraut Mix v2" dataset, emphasizing high-quality German and English data, including synthetic datasets.
  • Resource Efficiency: The primary objective was to demonstrate significant capability enhancement using a fraction of the resources typically required for fine-tuning, achieved through Spectrum Fine-Tuning.
  • Capability Preservation: This fine-tuning approach aims to improve performance in target languages (German and English) while largely preserving the foundational knowledge acquired by the base Llama-3.1 model.

Performance & Use Cases

The model shows improved German and English skills, with VAGO solutions highlighting strong results on the Hugging Face leaderboard. It is therefore well suited to applications that require solid performance in both languages, particularly where resource efficiency during fine-tuning matters. Evaluation results are reported on the AGIEval, GPT4All, TruthfulQA, and Open LLM Leaderboard 2 benchmarks.
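
As a rough usage sketch, the model can be loaded locally with the transformers chat pipeline. The German prompt and the generation settings below are illustrative, not recommendations from the model card:

```python
# Minimal local inference sketch using the transformers chat pipeline.
# Assumes a recent transformers release and enough GPU memory for an 8B
# model; quantized serving (e.g. the FP8 build listed above) is something
# the hosting provider handles, not this snippet.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Erkläre kurz den Unterschied zwischen Fine-Tuning und Pre-Training."},
]

out = pipe(messages, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# With chat-style input, generated_text is the conversation with the
# assistant's reply appended as the final message.
print(out[0]["generated_text"][-1]["content"])
```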

Popular Sampler Settings

The three most popular sampler configurations used by Featherless users for this model combine the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
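
For reference, this is how such sampler settings are typically passed through an OpenAI-compatible chat-completions client. The base URL follows Featherless's OpenAI-compatible API style, and every value below is an illustrative placeholder rather than one of the actual community presets from the tabs above:

```python
# Sketch: passing sampler settings through an OpenAI-compatible client.
# Treat the endpoint and all values as assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct",
    messages=[{"role": "user", "content": "Fasse den Satz des Pythagoras in einem Satz zusammen."}],
    temperature=0.7,        # standard OpenAI parameter
    top_p=0.9,              # standard OpenAI parameter
    frequency_penalty=0.0,  # standard OpenAI parameter
    presence_penalty=0.0,   # standard OpenAI parameter
    extra_body={            # sampler extensions beyond the OpenAI schema
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.1,
    },
)
print(response.choices[0].message.content)
```

Parameters outside the OpenAI schema (top_k, min_p, repetition_penalty) go through extra_body, since the official client does not define them as first-class arguments.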