Changgil/k2s3_test_24001
Changgil/k2s3_test_24001 is a 13-billion-parameter language model developed by Changgil Song, fine-tuned from Meta's Llama-2-13b-chat-hf. It was trained on approximately 800 million tokens drawn from sources including the Standard Korean Dictionary, KULLM data, dissertation abstracts, and AI Hub Korean language samples. The model is optimized for Korean language understanding and generation, and uses PEFT LoRA techniques for efficient fine-tuning. Its primary strength is processing and generating Korean text, making it suitable for applications that require robust Korean language capabilities.
Model Overview
Changgil/k2s3_test_24001 is a 13 billion parameter language model developed by Changgil Song, built upon the meta-llama/Llama-2-13b-chat-hf base model. It has been fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation) techniques to enhance its performance, particularly for Korean language tasks.
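Because the base model is meta-llama/Llama-2-13b-chat-hf, prompts are likely expected in the Llama-2 chat format. The model card does not document a prompt template, so the helper below is an assumption; in practice, `tokenizer.apply_chat_template` on the model's own tokenizer is the safer route. A minimal sketch of a single-turn Llama-2-style prompt:

```python
def build_llama2_chat_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-2 chat format.

    This mirrors the standard Llama-2-chat layout; whether this exact
    template was used for fine-tuning is an assumption, not documented
    on the model card.
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant that answers in Korean.",
    "안녕하세요, 자기소개를 해 주세요.",
)
print(prompt)
```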
Training Details
The model was trained on a diverse dataset of approximately 800 million tokens. This dataset includes significant Korean language resources such as:
- The Standard Korean Dictionary
- KULLM training data from Korea University
- Dissertation abstracts from master's and doctoral theses
- Korean language samples from AI Hub
Training was conducted on two A100 (80 GB) GPUs using the Hugging Face SFTTrainer with FSDP (Fully Sharded Data Parallel) for efficient memory usage and accelerated training. Key LoRA parameters were r = 8 and alpha = 16; the model was trained for 2 epochs with a batch size of 1 and gradient accumulation of 32 steps.
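The reported hyperparameters can be expressed as a PEFT/Transformers configuration. This is a sketch, not the author's actual training script: the values for `r`, `lora_alpha`, epochs, batch size, and gradient accumulation come from the model card, while `output_dir`, `bf16`, and the FSDP mode are illustrative assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings reported on the model card.
lora_config = LoraConfig(
    r=8,                      # reported LoRA rank
    lora_alpha=16,            # reported LoRA alpha
    task_type="CAUSAL_LM",
)

# Trainer settings matching the reported run; output_dir, bf16, and the
# specific FSDP strategy are assumptions for illustration.
training_args = TrainingArguments(
    output_dir="k2s3-finetune",
    num_train_epochs=2,               # reported epochs
    per_device_train_batch_size=1,    # reported batch size
    gradient_accumulation_steps=32,   # reported accumulation steps
    fsdp="full_shard",                # card reports FSDP; exact mode assumed
    bf16=True,                        # typical on A100 80 GB
)
```

These objects would then be passed to `trl`'s `SFTTrainer` along with the dataset; the dataset formatting used for this model is not documented, so it is omitted here.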
Key Capabilities
- Korean Language Proficiency: Optimized for understanding and generating text in Korean due to its specialized training data.
- Efficient Fine-tuning: Utilizes PEFT LoRA, allowing for more efficient adaptation to specific tasks or datasets.
Considerations for Use
When further fine-tuning this model, consider reusing the original LoRA hyperparameters (r = 8, alpha = 16) to ensure compatibility with the existing adapter weights and to achieve optimal performance.
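One way to recover those original values is to read them from the adapter's `adapter_config.json`, the standard artifact PEFT saves alongside LoRA weights. A minimal sketch, using a synthetic config file matching this model's reported settings (the on-disk path and full contents of the real config are assumptions):

```python
import json

def read_lora_hparams(path: str):
    """Return (r, lora_alpha) from a PEFT adapter_config.json file."""
    with open(path) as f:
        cfg = json.load(f)
    return cfg.get("r"), cfg.get("lora_alpha")

# Synthetic example config mirroring the model card's reported values.
example = {"peft_type": "LORA", "r": 8, "lora_alpha": 16}
with open("adapter_config.json", "w") as f:
    json.dump(example, f)

r, alpha = read_lora_hparams("adapter_config.json")
print(r, alpha)  # 8 16
```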