kyujinpy/PlatYi-34B-LoRA
PlatYi-34B-LoRA by Kyujin Han is a 34-billion-parameter auto-regressive language model built on the Yi-34B transformer architecture and fine-tuned with LoRA on the Open-Platypus dataset. It scores well on several benchmarks, notably MMLU and HellaSwag, which makes it a candidate for general-purpose reasoning and language understanding tasks. It supports a context length of 32768 tokens.
PlatYi-34B-LoRA Overview
PlatYi-34B-LoRA is a 34-billion-parameter auto-regressive language model developed by Kyujin Han. It is based on the 01-ai/Yi-34B transformer architecture and was fine-tuned with LoRA (Low-Rank Adaptation) using a lora_r value of 16. Training used the garage-bAInd/Open-Platypus dataset.
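To give a rough sense of what a lora_r value of 16 means, the sketch below counts trainable parameters for a single square weight matrix, once when the matrix is updated directly and once when it is adapted with two low-rank factors. The hidden size of 7168 is an assumption used for illustration, and the helper name `lora_param_counts` is hypothetical; which layers were actually adapted in this model is not stated here.

```python
def lora_param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_in * d_out        # updating the d_out x d_in matrix W directly
    lora = r * (d_in + d_out)  # A is r x d_in, B is d_out x r; W stays frozen
    return full, lora


HIDDEN = 7168  # assumed hidden size, for illustration only
R = 16         # lora_r value stated for this model

full, lora = lora_param_counts(HIDDEN, HIDDEN, R)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumptions the adapter trains well under 1% of the parameters of each matrix it is attached to, which is why LoRA fine-tuning of a 34B model is far cheaper than full fine-tuning.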
Key Capabilities & Performance
This model is competitive on the Open LLM Leaderboard, with an average score of 68.10 across the leaderboard's full benchmark suite. Notable individual results include:
- MMLU (5-Shot): 78.46
- HellaSwag (10-Shot): 85.37
- Winogrande (5-Shot): 83.66
- AI2 Reasoning Challenge (25-Shot): 67.15
Compared to its base model, Yi-34B, PlatYi-34B-LoRA is reported to improve on MMLU and Winogrande, which suggests better reasoning and common-sense understanding. The model takes text input and generates text output, with a context length of 32768 tokens.
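The reported average of 68.10 is lower than three of the four scores listed above, so it cannot be the mean of those four alone. The following sketch works through the arithmetic, assuming the average is taken over six benchmarks (the four listed plus two not shown here); the benchmark count is an assumption, and the derived figure is only an implied mean, not a reported result.

```python
# Scores listed in this section.
listed = {
    "MMLU": 78.46,
    "HellaSwag": 85.37,
    "Winogrande": 83.66,
    "ARC": 67.15,
}
REPORTED_AVERAGE = 68.10
ASSUMED_BENCHMARKS = 6  # assumption: four listed benchmarks plus two unlisted

listed_sum = sum(listed.values())
listed_mean = listed_sum / len(listed)

# Mean the unlisted benchmarks would need for the overall average to hold.
unlisted_count = ASSUMED_BENCHMARKS - len(listed)
implied_unlisted_mean = (
    REPORTED_AVERAGE * ASSUMED_BENCHMARKS - listed_sum
) / unlisted_count

print(f"mean of listed scores:   {listed_mean:.2f}")
print(f"implied mean of others:  {implied_unlisted_mean:.2f}")
```

If the six-benchmark assumption holds, the unlisted benchmarks pull the average down considerably, so the headline scores above should not be read as representative of every task.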
Use Cases
PlatYi-34B-LoRA is suited to applications that need general language understanding, reasoning, and question answering, particularly tasks that resemble those measured by MMLU and HellaSwag. Because it was fine-tuned with LoRA, only a small set of low-rank adapter weights was trained on top of the Yi-34B base model rather than all 34 billion parameters.