Model Overview
lgaalves/mistral-7b-platypus1k is a 7-billion-parameter instruction-tuned model based on the Mistral-7B transformer architecture. Developed by Luiz G A Alves, it is fine-tuned specifically for improved performance on STEM and logic-based tasks.
Key Capabilities & Performance
This model performs well on reasoning and factual-recall benchmarks. It was instruction fine-tuned with LoRA on a single Tesla V100-SXM2-16GB, using the garage-bAInd/Open-Platypus dataset, which focuses on STEM and logic.
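LoRA (Low-Rank Adaptation), the fine-tuning method mentioned above, freezes the pretrained weights and trains only a pair of small low-rank matrices per adapted layer. The following is a minimal numerical sketch of the idea; the dimensions are illustrative and are not Mistral-7B's actual layer sizes.

```python
import numpy as np

# LoRA sketch: instead of updating the full weight matrix W (d_out x d_in),
# train two small matrices A (r x d_in) and B (d_out x r) with rank
# r << min(d_out, d_in). The effective weight becomes:
#   W_eff = W + (alpha / r) * B @ A
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # zero-initialized: adapter starts as a no-op

delta = (alpha / r) * B @ A
W_eff = W + delta

# Before any training step, B is zero, so W_eff equals W exactly.
assert np.allclose(W_eff, W)

# Only A and B are trained, which is far fewer parameters than full fine-tuning.
full_params = d_out * d_in          # 4096
lora_params = r * (d_out + d_in)    # 1024
print(lora_params, full_params)
```

This parameter reduction is what makes fine-tuning a 7B model feasible on a single 16 GB V100.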
Benchmark Highlights (compared to Mistral-7B-v0.1 and Platypus2-7B):
- Avg. Score: Achieves an average score of 63.66, surpassing Mistral-7B-v0.1 (62.4) and Platypus2-7B (56.13).
- ARC (25-shot): Scores 61.60, outperforming Mistral-7B-v0.1 (59.98) and Platypus2-7B (55.20).
- TruthfulQA (0-shot): Leads with 46.96, compared to Mistral-7B-v0.1 (42.15) and Platypus2-7B (40.64).
- It scores slightly lower on HellaSwag and MMLU than base Mistral-7B, but its higher overall average and gains on ARC and TruthfulQA reflect the targeted improvements from fine-tuning.
Intended Use Cases
This model is well-suited for applications that require logical reasoning, factual accuracy, and strong performance on STEM-related queries. Its instruction tuning makes it effective for question-answering and instruction-following tasks in English. As with all LLMs, developers should perform safety testing tailored to their specific applications.
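Because the model is instruction-tuned on Open-Platypus data, prompts are typically wrapped in an Alpaca-style instruction template. The helper below is a sketch of that template; the exact prompt format for this checkpoint is an assumption and should be confirmed against the model card before use.

```python
from typing import Optional


def format_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Build an Alpaca-style prompt, the template commonly used by
    Platypus-family fine-tunes (assumed here, not confirmed for this model)."""
    if context:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = format_prompt("What is the derivative of x**2?")
print(prompt)
```

The resulting string can then be passed to the model through any standard text-generation interface; the model's completion follows the final "### Response:" marker.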