KoT-platypus2-13B: Korean Chain-of-Thought LLaMA2 Model
KoT-platypus2-13B is a 13 billion parameter auto-regressive language model developed by Kyujin Han, built on the LLaMA2 transformer architecture. It is a fine-tuned version of KO-Platypus2-13B, enhanced with Chain-of-Thought (CoT) reasoning capabilities.
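As a standard causal language model on the Hugging Face Hub, it can be loaded with the transformers library. The snippet below is a minimal inference sketch; the Korean prompt wording and the generation settings are illustrative assumptions, not an officially documented template for this model.

```python
# Minimal inference sketch using the Hugging Face transformers library.
# The prompt and generation parameters below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "kyujinpy/KoT-platypus2-13B"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float16,  # 13B weights need roughly 26 GB in fp16
    device_map="auto",          # requires the `accelerate` package
)

# A step-by-step ("단계별로") instruction nudges the model toward its
# Chain-of-Thought behavior. (Prompt: "Answer the following question step
# by step: How long does it take from Seoul to Busan by train?")
prompt = "다음 질문에 단계별로 답하세요: 서울에서 부산까지 기차로 얼마나 걸리나요?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```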
Key Capabilities & Features
- Architecture: Based on the robust LLaMA2 transformer architecture.
- Chain-of-Thought Integration: Incorporates CoT reasoning by fine-tuning on the KoCoT_2000 dataset, a Korean translation of the kaist-CoT dataset.
- Korean Language Optimization: Designed and trained to excel in Korean language understanding and generation tasks.
- Performance: Achieves an average score of 49.55 on the Open KO-LLM LeaderBoard, outperforming its base model (KO-Platypus2-13B) and several other comparable Korean LLMs on specific benchmarks such as Ko-CommonGen V2.
- Training: Fine-tuned on a 40GB A100 GPU with a batch size of 64, 15 epochs, a learning rate of 1e-5, and a cutoff length of 4096 tokens (see the configuration sketch after this list).
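For context, the reported hyperparameters map naturally onto a Hugging Face TrainingArguments object. The sketch below is an assumption about how such a run could be expressed; the actual training script, and whether LoRA or other parameter-efficient methods were used, are not specified here. Only the batch size, epoch count, learning rate, and cutoff length come from the model card.

```python
# Hypothetical reconstruction of the reported fine-tuning setup.
# Reported values: batch size 64, 15 epochs, lr 1e-5, cutoff length 4096.
# Everything else (output path, batch splitting, precision) is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kot-platypus2-13b-ft",  # assumed output path
    per_device_train_batch_size=8,      # assumed split: 8 x 8 accumulation
    gradient_accumulation_steps=8,      # gives the reported effective batch of 64
    num_train_epochs=15,                # reported: 15 epochs
    learning_rate=1e-5,                 # reported: 1e-5
    fp16=True,                          # assumed, given a single 40GB A100
)

MAX_LENGTH = 4096  # reported cutoff length, applied at tokenization time
```

Splitting the effective batch of 64 via gradient accumulation is one plausible way to fit a 13B model on a single 40GB GPU; the card does not state how this was actually done.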
Ideal Use Cases
- Korean NLP Applications: Suitable for various natural language processing tasks requiring strong Korean language comprehension and generation.
- Reasoning Tasks: Chain-of-Thought fine-tuning makes the model potentially more effective at complex, multi-step reasoning in Korean.
- Research and Development: A valuable resource for researchers and developers working on Korean LLMs and exploring CoT methodologies.