Changgil/k2s3_test_24001
Changgil/k2s3_test_24001 is a 13-billion-parameter language model developed by Changgil Song, fine-tuned from Meta's Llama-2-13b-chat-hf. It was trained on approximately 800 million tokens drawn from sources including the Standard Korean Dictionary, KULLM data, dissertation abstracts, and AI Hub Korean language samples. The model is optimized for Korean language understanding and generation, and uses PEFT LoRA techniques for efficient fine-tuning. Its primary strength is processing and generating Korean text, making it suitable for applications that require robust Korean language capabilities.
Model Overview
Changgil/k2s3_test_24001 is a 13 billion parameter language model developed by Changgil Song, built upon the meta-llama/Llama-2-13b-chat-hf base model. It has been fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation) techniques to enhance its performance, particularly for Korean language tasks.
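Because the base model is meta-llama/Llama-2-13b-chat-hf, prompts are likely expected in the Llama-2 chat format. The model card does not document a prompt template, so the helper below is an assumption; in practice, `tokenizer.apply_chat_template` on the model's own tokenizer is the safer route. A minimal sketch of a single-turn Llama-2-style prompt:

```python
def build_llama2_chat_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-2 chat format.

    This mirrors the standard Llama-2-chat layout; whether this exact
    template was used for fine-tuning is an assumption, not documented
    on the model card.
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant that answers in Korean.",
    "안녕하세요, 자기소개를 해 주세요.",
)
print(prompt)
```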
Training Details
The model was trained on a diverse dataset of approximately 800 million tokens. This dataset includes significant Korean language resources such as:
- The Standard Korean Dictionary
- KULLM training data from Korea University
- Dissertation abstracts from master's and doctoral theses
- Korean language samples from AI Hub
Training was conducted on two A100 (80 GB) GPUs using the Hugging Face SFTTrainer with FSDP (Fully Sharded Data Parallel) for efficient memory usage and accelerated training. Key LoRA parameters were r = 8 and alpha = 16; the model was trained for 2 epochs with a batch size of 1 and gradient accumulation of 32 steps.
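The reported hyperparameters can be expressed as a PEFT/Transformers configuration. This is a sketch, not the author's actual training script: the values for `r`, `lora_alpha`, epochs, batch size, and gradient accumulation come from the model card, while `output_dir`, `bf16`, and the FSDP mode are illustrative assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings reported on the model card.
lora_config = LoraConfig(
    r=8,                      # reported LoRA rank
    lora_alpha=16,            # reported LoRA alpha
    task_type="CAUSAL_LM",
)

# Trainer settings matching the reported run; output_dir, bf16, and the
# specific FSDP strategy are assumptions for illustration.
training_args = TrainingArguments(
    output_dir="k2s3-finetune",
    num_train_epochs=2,               # reported epochs
    per_device_train_batch_size=1,    # reported batch size
    gradient_accumulation_steps=32,   # reported accumulation steps
    fsdp="full_shard",                # card reports FSDP; exact mode assumed
    bf16=True,                        # typical on A100 80 GB
)
```

These objects would then be passed to `trl`'s `SFTTrainer` along with the dataset; the dataset formatting used for this model is not documented, so it is omitted here.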
Key Capabilities
- Korean Language Proficiency: Optimized for understanding and generating text in Korean due to its specialized training data.
- Efficient Fine-tuning: Utilizes PEFT LoRA, allowing for more efficient adaptation to specific tasks or datasets.
Considerations for Use
When further fine-tuning this model, consider reusing the original LoRA hyperparameters (r = 8, alpha = 16) to ensure compatibility with the existing adapter weights and to achieve optimal performance.
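One way to recover those original values is to read them from the adapter's `adapter_config.json`, the standard artifact PEFT saves alongside LoRA weights. A minimal sketch, using a synthetic config file matching this model's reported settings (the on-disk path and full contents of the real config are assumptions):

```python
import json

def read_lora_hparams(path: str):
    """Return (r, lora_alpha) from a PEFT adapter_config.json file."""
    with open(path) as f:
        cfg = json.load(f)
    return cfg.get("r"), cfg.get("lora_alpha")

# Synthetic example config mirroring the model card's reported values.
example = {"peft_type": "LORA", "r": 8, "lora_alpha": 16}
with open("adapter_config.json", "w") as f:
    json.dump(example, f)

r, alpha = read_lora_hparams("adapter_config.json")
print(r, alpha)  # 8 16
```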