Changgil/K2S3-SOLAR-11b-v1.0
Changgil/K2S3-SOLAR-11b-v1.0 is an instruction-tuned language model of roughly 11 billion parameters, developed by K2S3 and fine-tuned from the upstage/SOLAR-10.7B-Instruct-v1.0 base model. It was trained on Korean language resources, including the Standard Korean Dictionary and academic thesis abstracts, to improve its proficiency in Korean. The model targets Korean language understanding and generation tasks, which makes it a candidate for applications that need high-quality Korean text processing.
K2S3-SOLAR-11b-v1.0: Korean Language Optimized LLM
K2S3-SOLAR-11b-v1.0 is a language model of roughly 11 billion parameters developed by K2S3 and built on the upstage/SOLAR-10.7B-Instruct-v1.0 base model. It was trained with Supervised Fine-Tuning (SFT) using full-parameter tuning, with the specific goal of improving performance in Korean.
Key Capabilities & Training Details
- Korean Language Specialization: The training data combines several Korean resources: the Standard Korean Dictionary, training data from Korea University's KULLM, abstracts of master's and doctoral theses, and Korean language samples from AI Hub. This mix is intended to give the model strong proficiency in Korean understanding and generation.
- Fine-tuning Process: The model was fine-tuned with the Hugging Face SFTTrainer and FSDP (Fully Sharded Data Parallel). New Korean tokens, trained with the SentencePieceBPETokenizer, were added to the vocabulary. Training ran for 2 epochs with a batch size of 1 and 32 gradient accumulation steps.
- Hardware: Training was performed on two A100 GPUs (80 GB each).
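To make the training figures above concrete, the following is a minimal sketch of how they combine into an effective batch size. It assumes that the stated batch size of 1 is per GPU and that both GPUs process different sequences in parallel under FSDP; neither point is confirmed by the model card. The `TrainingSetup` class and the dataset size used in the example are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters reported in the model card (hypothetical container)."""

    per_device_batch_size: int = 1         # "batch size of 1", assumed to be per GPU
    gradient_accumulation_steps: int = 32  # "gradient accumulation of 32"
    num_gpus: int = 2                      # two A100 80 GB GPUs
    num_epochs: int = 2                    # "2 epochs"

    def sequences_per_optimizer_step(self) -> int:
        # Each GPU accumulates gradients over several micro-batches
        # before the optimizer applies a single update.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_gpus
        )

    def optimizer_steps(self, num_training_sequences: int) -> int:
        # Ceiling division: a partial final batch still counts as one step here.
        # Real trainers may instead drop or pad the last incomplete batch.
        per_step = self.sequences_per_optimizer_step()
        steps_per_epoch = -(-num_training_sequences // per_step)
        return steps_per_epoch * self.num_epochs


setup = TrainingSetup()
print(setup.sequences_per_optimizer_step())
# The dataset size below is made up; the model card does not state one.
print(setup.optimizer_steps(num_training_sequences=6400))
```

Under these assumptions each optimizer update sees 1 × 32 × 2 = 64 sequences. If the reported batch size were global rather than per GPU, the figure would be 32 instead.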
Ideal Use Cases
- Korean Text Generation: Generating fluent, contextually relevant Korean text.
- Korean Language Understanding: Suitable for tasks requiring deep comprehension of Korean, such as summarization, translation, or question answering in Korean.
- Research and Development: A starting point for further research and development in Korean natural language processing.
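For the use cases above, the following is a minimal sketch of loading the model for Korean text generation with the Hugging Face `transformers` library. It assumes that the checkpoint is published under the repository ID shown in the title, that it loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes, and that `torch` and `accelerate` are installed (the latter is needed for `device_map="auto"`). The model card does not state a prompt template, so the prompt is passed through unchanged; the `generate_korean` helper is hypothetical.

```python
MODEL_ID = "Changgil/K2S3-SOLAR-11b-v1.0"


def generate_korean(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation for `prompt` (illustrative helper, not an official API)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",          # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for reproducible output
        )

    # Strip the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the full checkpoint on first use):
# print(generate_korean("한국의 수도는 어디인가요?"))
```

The helper reloads the model on every call to keep the sketch short; a real application would load the tokenizer and model once and reuse them.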