VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
Hugging Face model card

Text generation · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Jul 22, 2024 · License: apache-2.0 · Architecture: Transformer · Concurrency cost: 1 · Open weights


VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct Overview

VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct is a 12 billion parameter instruction-tuned model developed by VAGO solutions. It is a fine-tuned version of mistralai/Mistral-Nemo-Instruct-2407, specifically optimized for resource-efficient enhancement of language capabilities.
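Since the model inherits its chat format from Mistral-Nemo-Instruct, prompts follow a Mistral-style `[INST]` template. A minimal sketch of assembling such a prompt by hand (the exact format here is an assumption; the authoritative template ships with the model's tokenizer and is applied via `tokenizer.apply_chat_template`):

```python
def build_mistral_prompt(messages):
    """Assemble a Mistral-style [INST] prompt from chat messages.

    This mirrors the common Mistral instruct format; verify against the
    model's bundled chat template before relying on it.
    """
    parts = ["<s>"]
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        elif msg["role"] == "assistant":
            parts.append(f"{msg['content']}</s>")
    return "".join(parts)

prompt = build_mistral_prompt([
    {"role": "user", "content": "Fasse den Text kurz zusammen."},
])
print(prompt)  # <s>[INST] Fasse den Text kurz zusammen. [/INST]
```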

Key Capabilities and Features

  • Resource-Efficient Fine-Tuning: Leverages Spectrum Fine-Tuning, targeting only 25% of the model's layers to significantly improve capabilities with reduced computational resources compared to traditional fine-tuning.
  • Multilingual Enhancement: Primarily fine-tuned on a proprietary German-English "Sauerkraut Mix v2" dataset, leading to substantially improved skills in both German and English. The fine-tuning process also demonstrates inter-language effects, enhancing performance in other languages the base Nemo model supports.
  • High-Quality Data: The "Sauerkraut Mix v2" dataset consists of meticulously selected, high-quality datasets combined with cutting-edge synthetic data.
  • Demonstration of Spectrum Fine-Tuning: The model serves as a showcase for the effectiveness of Spectrum Fine-Tuning in enhancing large language models while preserving the majority of their previously acquired knowledge.
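The layer-selection idea behind Spectrum Fine-Tuning can be sketched abstractly: score each decoder layer, then leave only the top quarter trainable and freeze the rest. A minimal, framework-free sketch, with placeholder scores (the real Spectrum method derives signal-to-noise ratios from each layer's weight matrices, which is not reproduced here):

```python
def select_trainable_layers(layer_scores, fraction=0.25):
    """Return indices of the top-scoring layers to leave trainable.

    layer_scores: one score per layer (Spectrum uses an SNR computed
    from the layer's weight matrices; values here are placeholders).
    """
    k = max(1, round(len(layer_scores) * fraction))
    ranked = sorted(range(len(layer_scores)),
                    key=lambda i: layer_scores[i], reverse=True)
    return sorted(ranked[:k])

# Dummy scores for a 40-layer decoder stack (illustrative only):
scores = [(i * 37) % 41 / 41 for i in range(40)]
trainable = select_trainable_layers(scores)
print(len(trainable))  # 10 layers, i.e. 25% of 40

# In a real training loop you would then freeze everything else, e.g.:
# for i, layer in enumerate(model.model.layers):
#     for p in layer.parameters():
#         p.requires_grad = i in trainable
```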

Use Cases and Strengths

  • Multilingual Applications: Ideal for applications requiring strong performance in German and English, with benefits extending to other languages.
  • Resource-Constrained Environments: Suitable for developers looking to fine-tune large language models efficiently without extensive computational resources.
  • Research and Development: Provides a practical example of advanced fine-tuning techniques for LLMs.
Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model tune the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
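These parameters map directly onto an OpenAI-compatible chat completions request. A sketch of such a request body, assuming an OpenAI-style endpoint; the parameter values are illustrative placeholders, not actual Featherless user statistics:

```python
import json

# Hypothetical sampler configuration using the parameters listed above.
payload = {
    "model": "VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct",
    "messages": [{"role": "user", "content": "Hallo!"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
body = json.dumps(payload)
# body is the JSON to POST to an OpenAI-compatible
# /v1/chat/completions endpoint (URL depends on your provider).
```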