kyujinpy/PlatYi-34B-LoRA

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Dec 1, 2023 | License: cc-by-nc-sa-4.0 | Architecture: Transformer | 0.0K | Open Weights | Cold

PlatYi-34B-LoRA by Kyujin Han is a 34-billion-parameter auto-regressive language model built on the Yi-34B transformer architecture and fine-tuned with LoRA on the Open-Platypus dataset. It scores well across several benchmarks, notably MMLU and HellaSwag, which makes it suitable for general-purpose reasoning and language-understanding tasks. It supports a context length of 32,768 tokens, leaving room for long, complex prompts.

PlatYi-34B-LoRA Overview

PlatYi-34B-LoRA is a 34-billion-parameter auto-regressive language model developed by Kyujin Han. It is based on the 01-ai/Yi-34B transformer model and was fine-tuned using LoRA (Low-Rank Adaptation) with a rank (lora_r) of 16. Training used the garage-bAInd/Open-Platypus dataset.
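
To give a sense of what a rank of 16 means in practice, the sketch below counts the trainable parameters that LoRA adds to a single weight matrix. The 7168 x 7168 layer shape is an assumption chosen for illustration (the source does not state which layers were adapted or their dimensions), and `lora_param_count` is a hypothetical helper, not part of any library.

```python
def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Parameters in the two low-rank factors: B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


# Assumed, illustrative layer shape; NOT taken from the model card.
D_OUT, D_IN = 7168, 7168
LORA_R = 16  # rank stated in the model card (lora_r)

full = D_OUT * D_IN                            # weights in the frozen matrix
lora = lora_param_count(D_OUT, D_IN, LORA_R)   # trainable adapter weights
ratio = lora / full

print(f"full matrix:  {full:,} params")
print(f"LoRA factors: {lora:,} params ({ratio:.2%} of the full matrix)")
```

Under these assumptions the adapter for one such matrix is a fraction of a percent of the matrix it modifies, which is why LoRA fine-tuning of a 34B model is far cheaper than updating every weight.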

Key Capabilities & Performance

The model has an average score of 68.10 on the Open LLM Leaderboard. Notable benchmark results include:

  • MMLU (5-Shot): 78.46
  • HellaSwag (10-Shot): 85.37
  • Winogrande (5-Shot): 83.66
  • AI2 Reasoning Challenge (25-Shot): 67.15
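
As a quick arithmetic check, the four scores listed above average well above the reported 68.10, so the leaderboard average must also include benchmarks that are not listed here. The sketch below assumes the reported figure is an unweighted mean over six benchmarks; that count is an assumption, not something the source states, and the implied value for the unlisted benchmarks is only as good as that assumption.

```python
# Scores copied from the list above.
listed = {
    "MMLU (5-shot)": 78.46,
    "HellaSwag (10-shot)": 85.37,
    "Winogrande (5-shot)": 83.66,
    "AI2 Reasoning Challenge (25-shot)": 67.15,
}

REPORTED_AVERAGE = 68.10
ASSUMED_BENCHMARK_COUNT = 6  # assumption: unweighted mean over six benchmarks

listed_sum = sum(listed.values())
listed_mean = round(listed_sum / len(listed), 2)

# Total score implied by the reported average, minus what is already listed.
unlisted_count = ASSUMED_BENCHMARK_COUNT - len(listed)
unlisted_sum = REPORTED_AVERAGE * ASSUMED_BENCHMARK_COUNT - listed_sum
unlisted_mean = round(unlisted_sum / unlisted_count, 2)

print(f"mean of the four listed scores: {listed_mean}")
print(f"implied mean of the {unlisted_count} unlisted scores: {unlisted_mean}")
```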

Compared to its base model, Yi-34B, PlatYi-34B-LoRA shows improvements on MMLU and Winogrande, suggesting better reasoning and common-sense understanding. The model takes text as input and generates text as output, with a context length of 32,768 tokens.
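
In a decoder-only model the context window has to hold both the prompt and the generated continuation, so long prompts reduce how much text can be produced. A minimal sketch of that budget check follows; `max_new_tokens_allowed` and `fits_in_context` are hypothetical helpers, and the token counts are assumed to come from whatever tokenizer the serving stack uses.

```python
CONTEXT_LENGTH = 32768  # context length stated in the model card


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for generation once the prompt occupies part of the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, requested_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """True if the prompt plus the requested generation fits in the window."""
    return prompt_tokens + requested_new_tokens <= context_length


# Example: a 30,000-token prompt leaves 2,768 tokens for the response.
remaining = max_new_tokens_allowed(30_000)
print(remaining, fits_in_context(30_000, remaining))
```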

Use Cases

PlatYi-34B-LoRA is well suited to applications that need strong general language understanding, reasoning, and question answering, particularly workloads that resemble the MMLU and HellaSwag benchmarks. Because it was fine-tuned with LoRA, it adapts its base model by training only a small set of low-rank adapter weights.