Undi95/Amethyst-13B
Amethyst-13B: A Merged 13B Language Model
Amethyst-13B is a 13 billion parameter language model developed by Undi95, created with the BlockMerge_Gradient merging technique. The model integrates several base models and LoRAs: Xwin-LM/Xwin-LM-13B-V0.1, The-Face-Of-Goonery/Huginn-13b-FP16, zattio770/120-Days-of-LORA-v2-13B, and, notably, lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT.
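The core idea of gradient-based block merging is to blend two checkpoints with a per-layer mixing ratio rather than a single global weight. The sketch below is a minimal illustration of that idea only, not the actual BlockMerge_Gradient script: `gradient_merge` and its ramp parameters are hypothetical, and the real tool handles more parents and LoRA application.

```python
# Minimal sketch of gradient block merging between two Llama-2-13B
# state dicts (e.g. loaded via torch.load). Hypothetical helper, not
# the actual BlockMerge_Gradient script.
import torch

def gradient_merge(state_a, state_b, num_layers=40, start=0.9, end=0.3):
    """Interpolate each tensor, with the mix ratio itself ramping
    (the "gradient") from `start` at layer 0 to `end` at the top layer."""
    merged = {}
    for name, tensor_a in state_a.items():
        tensor_b = state_b[name]
        # Derive a layer index from names like
        # "model.layers.17.self_attn.q_proj.weight"; non-layer tensors
        # (embeddings, final norm, lm_head) just use the start ratio.
        parts = name.split(".")
        if "layers" in parts:
            layer = int(parts[parts.index("layers") + 1])
            ratio = start + (end - start) * layer / max(num_layers - 1, 1)
        else:
            ratio = start
        merged[name] = ratio * tensor_a + (1.0 - ratio) * tensor_b
    return merged
```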
Key Characteristics & Performance
- Architecture: A merge of multiple 13B models and LoRAs, aiming for improved results through gradient-based merging.
- Instruction Format: Uses the Alpaca prompt template for instruction-following tasks (see the example after this list).
- Context Length: Supports a context window of 4096 tokens.
- Leaderboard Performance: Achieves an average score of 51.2 on the Open LLM Leaderboard. Specific scores include:
  - ARC (25-shot): 62.63
  - HellaSwag (10-shot): 83.17
  - MMLU (5-shot): 55.91
  - TruthfulQA (0-shot): 52.43
  - Winogrande (5-shot): 74.74
  - GSM8K (5-shot): 10.84
  - DROP (3-shot): 18.7
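For reference, the standard Alpaca template mentioned above looks like the following; `build_prompt` is an illustrative helper, not part of the model's tooling.

```python
# Standard Alpaca prompt format (single-turn, no input field).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```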
Unique Aspects
The model card specifically highlights the inclusion of LimaRP v3, suggesting the merge is geared toward role-playing and conversational use, and it provides recommended settings for front-ends such as SillyTavern. This combination of merged components and role-play-oriented fine-tuning distinguishes Amethyst-13B from its base models.
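As a usage illustration, here is a minimal generation sketch with Hugging Face transformers. The sampling values are placeholders, not the card's recommended SillyTavern preset.

```python
# Minimal transformers generation sketch for Amethyst-13B.
# Sampling parameters below are illustrative defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/Amethyst-13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Alpaca-formatted prompt, as the card prescribes.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Write a short in-character greeting for a tavern keeper.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```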