garage-bAInd/Platypus2-13B
Platypus2-13B: A STEM and Logic-Focused LLaMA2 Fine-Tune
Platypus2-13B is a 13 billion parameter instruction fine-tuned model built upon the LLaMA2 transformer architecture, developed by Cole Hunter and Ariel Lee. This model is distinguished by its specialized training on the garage-bAInd/Open-Platypus dataset, which is heavily focused on STEM (Science, Technology, Engineering, and Mathematics) and logic-based content.
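For orientation, the snippet below is a minimal sketch of loading the model with Hugging Face transformers and prompting it with an Alpaca-style instruction template; the example question, generation settings, and dtype/device choices are illustrative assumptions rather than settings taken from this card.

```python
# Minimal sketch: load Platypus2-13B and run one instruction-following query.
# Generation settings are illustrative assumptions, not official defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "garage-bAInd/Platypus2-13B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 13B model on one large GPU
    device_map="auto",          # requires the accelerate package
)

# Alpaca-style instruction template commonly used with Platypus models.
prompt = (
    "### Instruction:\n\n"
    "Explain why the sum of two even integers is always even.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Greedy decoding (do_sample=False) is used here only to keep the example deterministic; sampling parameters can be tuned per application.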
Key Capabilities & Differentiators
- Specialized Training: Instruction fine-tuned specifically on STEM and logic-based data, enhancing its performance in these domains.
- LLaMA2 Foundation: Benefits from the robust architecture of LLaMA2-13B.
- Efficient Fine-tuning: Trained using LoRA on a single A100 80GB GPU, demonstrating efficient resource utilization (see the configuration sketch after this list).
- English Language Focus: Optimized for tasks and applications in English.
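To make the LoRA point above concrete, here is a minimal configuration sketch using the Hugging Face peft library. The rank, scaling factor, dropout, and target modules are assumptions chosen for illustration; the authors' actual hyperparameters are documented in the Platypus paper and repository, not here.

```python
# Sketch of a LoRA setup in the spirit of the training described above.
# All hyperparameter values are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# The LLaMA2 base weights are gated on the Hub and carry Meta's license terms.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections commonly targeted
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```

Because LoRA freezes the base weights and trains only small adapter matrices, a 13B model can be fine-tuned on a single 80GB GPU, which is what makes the resource claim above plausible.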
Performance Highlights (Open LLM Leaderboard)
Platypus2-13B shows competitive performance on the Open LLM Leaderboard, with an average score of 48.04 across the full benchmark suite (only a subset of which is listed here). Notable per-task results include (see the reproduction sketch after this list):
- ARC (25-shot): 61.26
- HellaSwag (10-shot): 82.56
- MMLU (5-shot): 56.70
- TruthfulQA (0-shot): 44.86
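As a rough guide to reproducing one of these figures, the sketch below uses EleutherAI's lm-evaluation-harness via its Python API. It assumes a recent harness version; the leaderboard pins a specific harness version and prompt setup, so locally computed scores may not match the numbers above exactly.

```python
# Sketch: re-running the ARC leaderboard evaluation locally.
# Assumes a recent lm-evaluation-harness (pip install lm-eval); scores may
# differ slightly from the pinned leaderboard version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=garage-bAInd/Platypus2-13B,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,  # ARC is reported 25-shot on the leaderboard
)
print(results["results"]["arc_challenge"])
```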
Use Cases & Considerations
This model is particularly well-suited to applications requiring strong performance on scientific, technical, and logical reasoning tasks. Developers should note the non-commercial CC BY-NC-4.0 license on this fine-tune, as well as the separate license terms attached to the underlying LLaMA2 weights. As with all LLMs, users should conduct safety testing and tuning for their specific applications, since the model can produce inaccurate or unexpected outputs.