Undi95/Amethyst-13B

Available on Hugging Face

TEXT GENERATION

  • Model Size: 13B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Sep 24, 2023
  • License: cc-by-nc-4.0
  • Architecture: Transformer
  • Concurrency Cost: 1
  • Availability: Open Weights

Undi95/Amethyst-13B is a 13 billion parameter language model developed by Undi95, built upon Xwin-LM/Xwin-LM-13B-V0.1 and other merged models. Utilizing a BlockMerge_Gradient approach, it incorporates elements from Huginn-13b-FP16, 120-Days-of-LORA-v2-13B, and LimaRP-Llama2-13B-v3-EXPERIMENT. This model is instruction-tuned using the Alpaca prompt format and achieves an average score of 51.2 on the Open LLM Leaderboard, with a 4096-token context length.


Amethyst-13B: A Merged 13B Language Model

Amethyst-13B is a 13 billion parameter language model developed by Undi95, created through a BlockMerge_Gradient technique. This model integrates several base models and LoRAs, including Xwin-LM/Xwin-LM-13B-V0.1, The-Face-Of-Goonery/Huginn-13b-FP16, zattio770/120-Days-of-LORA-v2-13B, and notably, lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT.
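The exact recipe behind the merge is not published on this page, but the core idea of gradient-based block merging can be sketched: each layer of the merged model is a weighted average of the corresponding layers of two parents, with the blend ratio varying smoothly ("a gradient") across the layer stack. The linear schedule and key names below are illustrative assumptions, not Undi95's actual BlockMerge_Gradient configuration.

```python
# Minimal sketch of gradient-based block merging (illustrative only).
# Real merges operate on full transformer state dicts; here each "layer"
# is just a list of floats so the blending logic is easy to follow.

def blend_ratio(layer_idx: int, num_layers: int) -> float:
    """Linear gradient: layer 0 is all model A, the last layer all model B."""
    return layer_idx / (num_layers - 1)

def merge_state_dicts(sd_a: dict, sd_b: dict, num_layers: int) -> dict:
    """Blend two parent state dicts layer by layer along the gradient."""
    merged = {}
    for layer_idx in range(num_layers):
        key = f"layers.{layer_idx}.weight"  # hypothetical key naming
        t = blend_ratio(layer_idx, num_layers)
        merged[key] = [(1 - t) * a + t * b
                       for a, b in zip(sd_a[key], sd_b[key])]
    return merged
```

With three layers, the middle layer ends up as an even 50/50 blend of the two parents, while the first and last layers are copied from model A and model B respectively.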

Key Characteristics & Performance

  • Architecture: A merge of multiple 13B models and LoRAs, aiming for improved results through gradient-based merging.
  • Instruction Format: Utilizes the Alpaca prompt template for instruction-following tasks.
  • Context Length: Supports a context window of 4096 tokens.
  • Leaderboard Performance: Achieves an average score of 51.2 on the Open LLM Leaderboard. Specific scores include:
    • ARC (25-shot): 62.63
    • HellaSwag (10-shot): 83.17
    • MMLU (5-shot): 55.91
    • TruthfulQA (0-shot): 52.43
    • Winogrande (5-shot): 74.74
    • GSM8K (5-shot): 10.84
    • DROP (3-shot): 18.7
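Since the model expects the Alpaca prompt template, a small helper can assemble it. The template text below is the standard Alpaca format; whether Amethyst-13B was tuned with or without the preamble sentence is an assumption based on the model card's stated format.

```python
# Build a prompt in the standard Alpaca instruction format.
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    if user_input:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{user_input}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )
```

The model's completion is then generated after the trailing `### Response:` marker; note the 4096-token context limit applies to the prompt and completion combined.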

Unique Aspects

The merge specifically incorporates LimaRP v3, suggesting optimizations for role-playing and conversational use, and recommended settings are provided for front ends such as SillyTavern. This emphasis on merging with targeted fine-tuning components differentiates it from its base models.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
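To make the listed sampler parameters concrete, the sketch below shows how temperature, top_k, top_p (nucleus), and min_p act on a model's next-token logits. This is a generic illustration, not Featherless's implementation; the repetition, frequency, and presence penalties would adjust logits for already-seen tokens before this filtering step.

```python
import math

def candidate_tokens(logits, temperature=1.0, top_k=None, top_p=None, min_p=None):
    """Return the set of token indices that survive each sampling filter."""
    # Temperature-scaled softmax over the vocabulary.
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    if top_k is not None:   # keep only the k most probable tokens
        keep &= set(order[:top_k])
    if top_p is not None:   # smallest prefix whose mass reaches top_p
        nucleus, mass = set(), 0.0
        for i in order:
            nucleus.add(i)
            mass += probs[i]
            if mass >= top_p:
                break
        keep &= nucleus
    if min_p is not None:   # drop tokens far below the most probable one
        keep &= {i for i in order if probs[i] >= min_p * probs[order[0]]}
    return keep
```

A sampler then renormalizes the surviving tokens' probabilities and draws the next token from them; lower temperature and tighter top_p/min_p values make the output more deterministic.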