bhenrym14/mistral-7b-platypus-fp16
bhenrym14/mistral-7b-platypus-fp16 is a 7-billion-parameter instruction-tuned language model based on the Mistral-7B-v0.1 architecture. Developed by bhenrym14, it was fine-tuned on the Open-Platypus dataset to enhance its instruction-following capabilities. The model posts competitive scores on benchmarks such as MMLU (64.20) and ARC (62.80), making it suitable for general-purpose instruction-based tasks.
Overview
bhenrym14/mistral-7b-platypus-fp16 is a 7-billion-parameter instruction-tuned model built on the Mistral-7B-v0.1 base architecture. It was fine-tuned by bhenrym14 on the Open-Platypus dataset, a collection designed to improve instruction-following ability, using QLoRA fine-tuning with a LoRA rank of 64.
Key Capabilities & Performance
This model demonstrates solid performance across several benchmarks:
- ARC (25 shot): 62.80
- Hellaswag (10 shot): 84.12
- MMLU (5 shot): 64.20
The model's perplexity at shorter context lengths (512 tokens) is strong, but the README notes that at longer contexts it may be less competitive with larger models or those using dedicated context-extension techniques, possibly due to its sliding-window attention mechanism or its size.
Usage Notes
Users should be aware that, at the time the model card was written, the Mistral architecture required installing transformers from source. The model was trained with a legacy Airoboros system prompt (pre-2.0); prompting details can be found in the model card for bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16.
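Since the model expects the legacy Airoboros prompt style, a small helper that wraps user input in that template can save manual string assembly. The exact system-prompt wording below is an assumption based on the airoboros 1.4 lineage referenced above; verify it against the linked model card before relying on it.

```python
# Assumed legacy Airoboros (<2.0) single-turn template -- confirm the exact
# wording in the linked airoboros-33b model card before production use.
SYSTEM = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)

def build_prompt(user_message: str) -> str:
    """Wrap a user message in the assumed legacy Airoboros template."""
    return f"{SYSTEM} USER: {user_message} ASSISTANT:"

# The resulting string would be passed as the input text to the model,
# e.g. via transformers' tokenizer/generate pipeline.
prompt = build_prompt("Summarize the Open-Platypus dataset in one sentence.")
```

Keeping the template in one place also makes it easy to swap in the 2.0+ Airoboros format if a later checkpoint changes the expected prompt.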