grimjim/llama-3-Nephilim-v3-8B
grimjim/llama-3-Nephilim-v3-8B is an 8 billion parameter language model, merged using the task arithmetic method from a Meta Llama 3 base and tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1. The model is designed to be creative, with prompt steering employed to vary text generation output and mitigate common Llama 3 8B failings. It is particularly effective for creative text generation and can be used for roleplay, despite not being explicitly trained for it.
Model Overview
grimjim/llama-3-Nephilim-v3-8B is an 8 billion parameter language model created by grimjim through a merge of pre-trained models using the task arithmetic method. Built upon a Meta Llama 3 base, it integrates tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1 to enhance its capabilities.
Key Characteristics
- Creative Text Generation: The model is noted for its creative output, with prompt steering used to diversify text generation and address common issues observed in Llama 3 8B models.
- Roleplay Capability: Although not specifically trained for roleplay, the model can be effectively utilized for such applications.
- Merge Details: The merge used grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge as its base, combining it with tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1 at a weight of 0.1 for the latter.
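To make the merge details above more concrete, here is a minimal sketch of one common formulation of task arithmetic: each contributing model's difference from the base (its "task vector") is scaled by a weight and added back onto the base. The function name, the toy parameter dictionaries, and the use of plain Python lists are illustrative assumptions; this is not the tooling or configuration actually used to build the model, only the arithmetic applied to tiny made-up values with the 0.1 weight from the card.

```python
def task_arithmetic_merge(base, others):
    """Merge toy parameter dicts with task arithmetic (illustrative sketch).

    base   -- dict mapping parameter name -> list of floats (the base model)
    others -- list of (params_dict, weight) pairs to blend into the base
    """
    merged = {}
    for name, base_vals in base.items():
        out = list(base_vals)
        for params, weight in others:
            # Task vector: how this model differs from the base.
            task_vector = [p - b for p, b in zip(params[name], base_vals)]
            # Add the scaled task vector onto the running result.
            out = [o + weight * t for o, t in zip(out, task_vector)]
        merged[name] = out
    return merged


# Hypothetical stand-ins for real checkpoints (made-up values).
base_params = {"layer.weight": [1.0, 2.0, 3.0]}
donor_params = {"layer.weight": [2.0, 2.0, 1.0]}

# Weight of 0.1 for the second model, as stated in the merge details.
merged = task_arithmetic_merge(base_params, [(donor_params, 0.1)])
print(merged["layer.weight"])
```

With a single contributing model at weight 0.1, the result stays close to the base: each parameter moves one tenth of the way toward the second model.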
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 20.54, with specific results including:
- IFEval (0-Shot): 41.74
- BBH (3-Shot): 28.96
- MMLU-PRO (5-shot): 29.02
Usage Notes
The model was tested with a temperature of 1 and minP of 0.01. Users can adjust the temperature to control creativity. Initial format consistency issues can be mitigated with an Instruct prompt, and specific prompt templates are provided for optimal performance.
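The sampling settings above can be illustrated with a small sketch of how temperature and min-p filtering are commonly described: logits are divided by the temperature, converted to probabilities, and any token whose probability falls below min_p times the top token's probability is discarded before sampling. The function names and toy logits are assumptions for illustration, and the order in which inference backends apply temperature and min-p can differ, so treat this as a sketch rather than the behavior of any particular runtime.

```python
import math
import random


def min_p_filter(logits, temperature=1.0, min_p=0.01):
    """Return {token_index: probability} after temperature and min-p filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the max before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep tokens at least min_p times as likely as the most likely token.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    kept_total = sum(kept.values())
    # Renormalize so the surviving probabilities sum to 1.
    return {i: p / kept_total for i, p in kept.items()}


def sample_token(logits, temperature=1.0, min_p=0.01, rng=None):
    """Sample one token index from the filtered distribution."""
    rng = rng or random.Random()
    dist = min_p_filter(logits, temperature, min_p)
    indices = list(dist)
    weights = [dist[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy logits (made up): two plausible tokens and one very unlikely token.
toy_logits = [5.0, 4.0, -5.0]
filtered = min_p_filter(toy_logits, temperature=1.0, min_p=0.01)
token = sample_token(toy_logits, temperature=1.0, min_p=0.01, rng=random.Random(0))
```

A low min-p value such as 0.01 removes only the far tail of the distribution, so most of the variety at temperature 1 is preserved while very unlikely tokens are excluded.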