KoT-platypus2-13B: Korean Chain-of-Thought LLaMA2 Model
KoT-platypus2-13B is a 13 billion parameter auto-regressive language model developed by Kyujin Han, built on the LLaMA2 transformer architecture. It is a fine-tuned version of KO-Platypus2-13B, enhanced with Chain-of-Thought (CoT) reasoning capabilities.
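As a standard causal language model on the Hugging Face Hub, it can be loaded with the transformers library. The snippet below is a minimal inference sketch; the Korean prompt wording and the generation settings are illustrative assumptions, not an officially documented template for this model.

```python
# Minimal inference sketch using the Hugging Face transformers library.
# The prompt and generation parameters below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "kyujinpy/KoT-platypus2-13B"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float16,  # 13B weights need roughly 26 GB in fp16
    device_map="auto",          # requires the `accelerate` package
)

# A step-by-step ("단계별로") instruction nudges the model toward its
# Chain-of-Thought behavior. (Prompt: "Answer the following question step
# by step: How long does it take from Seoul to Busan by train?")
prompt = "다음 질문에 단계별로 답하세요: 서울에서 부산까지 기차로 얼마나 걸리나요?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```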
Key Capabilities & Features
- Architecture: Based on the robust LLaMA2 transformer architecture.
- Chain-of-Thought Integration: Incorporates CoT reasoning by fine-tuning on the KoCoT_2000 dataset, a Korean translation of the kaist-CoT dataset.
- Korean Language Optimization: Designed and trained to excel in Korean language understanding and generation tasks.
- Performance: Achieves an average score of 49.55 on the Open KO-LLM LeaderBoard, outperforming its base model (KO-Platypus2-13B) and several other comparable Korean LLMs on specific benchmarks such as Ko-CommonGen V2.
- Training: Fine-tuned on a 40GB A100 GPU with a batch size of 64, 15 epochs, a learning rate of 1e-5, and a cutoff length of 4096 tokens (see the configuration sketch after this list).
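For context, the reported hyperparameters map naturally onto a Hugging Face TrainingArguments object. The sketch below is an assumption about how such a run could be expressed; the actual training script, and whether LoRA or other parameter-efficient methods were used, are not specified here. Only the batch size, epoch count, learning rate, and cutoff length come from the model card.

```python
# Hypothetical reconstruction of the reported fine-tuning setup.
# Reported values: batch size 64, 15 epochs, lr 1e-5, cutoff length 4096.
# Everything else (output path, batch splitting, precision) is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kot-platypus2-13b-ft",  # assumed output path
    per_device_train_batch_size=8,      # assumed split: 8 x 8 accumulation
    gradient_accumulation_steps=8,      # gives the reported effective batch of 64
    num_train_epochs=15,                # reported: 15 epochs
    learning_rate=1e-5,                 # reported: 1e-5
    fp16=True,                          # assumed, given a single 40GB A100
)

MAX_LENGTH = 4096  # reported cutoff length, applied at tokenization time
```

Splitting the effective batch of 64 via gradient accumulation is one plausible way to fit a 13B model on a single 40GB GPU; the card does not state how this was actually done.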
Ideal Use Cases
- Korean NLP Applications: Suitable for various natural language processing tasks requiring strong Korean language comprehension and generation.
- Reasoning Tasks: Chain-of-Thought fine-tuning makes the model potentially more effective at complex, multi-step reasoning in Korean.
- Research and Development: A valuable resource for researchers and developers working on Korean LLMs and exploring CoT methodologies.