Changgil/K2S3-SOLAR-11b-v3.0

  • Text generation
  • Concurrency Cost: 1
  • Model Size: 10.7B
  • Quant: FP8
  • Ctx Length: 4k
  • Published: Mar 14, 2024
  • License: cc-by-nc-4.0
  • Architecture: Transformer
  • Open Weights
  • Warm

Changgil/K2S3-SOLAR-11b-v3.0 is a 10.7 billion parameter language model developed by K2S3, fine-tuned from the upstage/SOLAR-10.7B-v1.0 base model. It was trained with Supervised Fine-Tuning (SFT) on a mixed dataset that includes the Standard Korean Dictionary, KULLM data, academic abstracts, AI Hub Korean samples, alpaca-gpt4-data, and the OpenOrca dataset. The model targets general language tasks, with an emphasis on Korean-language data.


K2S3-SOLAR-11b-v3.0 Overview

K2S3-SOLAR-11b-v3.0 is a 10.7 billion parameter language model developed by K2S3, built on the upstage/SOLAR-10.7B-v1.0 base model. It was fine-tuned with full-parameter Supervised Fine-Tuning (SFT), meaning all model weights were updated rather than a small set of adapter weights.
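A minimal sketch of running the model locally, assuming the `transformers`, `torch`, and `accelerate` packages are installed and the weights are published under the repository name shown on this card. The `generate` helper, the bf16 dtype, and the default token budget are illustrative choices, not settings taken from the card.

```python
# Minimal inference sketch. Assumes `transformers`, `torch`, and `accelerate`
# are installed and that the weights are available under MODEL_ID on the Hub.

MODEL_ID = "Changgil/K2S3-SOLAR-11b-v3.0"  # repository name from this card


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a completion for `prompt` (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 to reduce memory use
        device_map="auto",           # requires `accelerate`
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("대한민국의 수도는 어디인가요?")` downloads the weights on first use. Prompt plus completion should stay within the 4k context length listed for the model.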

Key Training Details

  • Base Model: upstage/SOLAR-10.7B-v1.0
  • Training Method: Supervised Fine-Tuning (SFT) with full-parameter tuning, using the Hugging Face SFTTrainer and FSDP (Fully Sharded Data Parallel).
  • Hardware: Training was conducted on two NVIDIA A100 80 GB GPUs.
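As a rough illustration of why full-parameter tuning at this size is paired with sharding across both GPUs, the sketch below estimates training-state memory. The bytes-per-parameter figure is an assumption (bf16 weights, gradients, and Adam moments), not a value from the model card, and activation memory is ignored.

```python
# Back-of-the-envelope memory estimate for full-parameter fine-tuning.
# All byte counts are assumptions for illustration, not values from the model card.

PARAMS = 10.7e9       # parameter count stated on the card
GPU_MEMORY_GB = 80    # memory of one A100 80 GB
NUM_GPUS = 2          # GPU count stated on the card


def training_state_gb(params: float, bytes_per_param: float) -> float:
    """Memory for weights, gradients, and optimizer state, in GB (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def per_gpu_gb(total_gb: float, num_gpus: int) -> float:
    """Per-GPU share when the state is sharded evenly across devices."""
    return total_gb / num_gpus


# Assumed layout: 2 B weights + 2 B gradients + 2 x 2 B Adam moments, all in bf16.
BYTES_PER_PARAM_BF16 = 2 + 2 + 4

total = training_state_gb(PARAMS, BYTES_PER_PARAM_BF16)
shard = per_gpu_gb(total, NUM_GPUS)

print(f"total training state: {total:.1f} GB")
print(f"per GPU when sharded: {shard:.1f} GB")
print(f"fits on one GPU unsharded: {total <= GPU_MEMORY_GB}")
print(f"fits per GPU when sharded: {shard <= GPU_MEMORY_GB}")
```

Under these assumptions the training state exceeds a single 80 GB device but fits once it is sharded across two. A setup that keeps optimizer state in fp32 would need roughly twice as much.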

Training Data

The training dataset mixes general-purpose and Korean-specific linguistic resources:

  • Standard Korean Dictionary
  • KULLM training data from Korea University
  • Abstracts of master's and doctoral theses
  • Korean language samples from AI Hub
  • alpaca-gpt4-data
  • Samples from The OpenOrca Dataset

This diverse data mix aims to enhance the model's general language understanding and generation capabilities, with a notable inclusion of Korean linguistic resources.
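For SFT, instruction records such as those in alpaca-gpt4-data (which carry `instruction`, `input`, and `output` fields) are usually flattened into a single training string. The sketch below uses the widely known Alpaca prompt template as an assumption, since the card does not state which template K2S3 used. `format_example` is a hypothetical helper.

```python
# Flatten an instruction record into one SFT training string.
# The Alpaca-style template is an assumption; the card does not document the
# template actually used for K2S3-SOLAR-11b-v3.0.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_example(record: dict) -> str:
    """Return prompt + response for a record with instruction/input/output keys."""
    if record.get("input"):
        prompt = PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    else:
        # Records without extra context use the shorter template.
        prompt = PROMPT_NO_INPUT.format(instruction=record["instruction"])
    return prompt + record["output"]


example = {
    "instruction": "Translate to Korean.",
    "input": "Good morning",
    "output": "좋은 아침입니다",
}
text = format_example(example)
print(text)
```

Records from the other sources would need their own mapping into these three fields before they could share the same formatter.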

Popular Sampler Settings

Featherless tracks the top three sampler configurations its users apply to this model. Each configuration sets the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
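The sampler parameters named above can be collected into a request body as sketched here. The numeric values are placeholders chosen for illustration, not the settings reported by Featherless users, and the `model` and `prompt` field names assume a completion-style API. `build_payload` is a hypothetical helper.

```python
# Build a sampling-parameter payload with basic range checks.
# The values are placeholders for illustration, not the settings reported by
# Featherless users, and the surrounding request fields are assumptions.
import json

MODEL_ID = "Changgil/K2S3-SOLAR-11b-v3.0"  # repository name from this card


def build_payload(prompt: str, **sampler) -> dict:
    """Return a completion-style request body after validating sampler values."""
    if not 0.0 <= sampler.get("top_p", 1.0) <= 1.0:
        raise ValueError("top_p must be in [0, 1]")
    if not 0.0 <= sampler.get("min_p", 0.0) <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    if sampler.get("temperature", 1.0) < 0.0:
        raise ValueError("temperature must be non-negative")
    return {"model": MODEL_ID, "prompt": prompt, **sampler}


payload = build_payload(
    "한국의 사계절을 설명해 주세요.",
    temperature=0.7,         # placeholder value
    top_p=0.9,               # placeholder value
    top_k=40,                # placeholder value
    frequency_penalty=0.0,   # placeholder value
    presence_penalty=0.0,    # placeholder value
    repetition_penalty=1.1,  # placeholder value
    min_p=0.05,              # placeholder value
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Validating ranges before sending keeps obviously invalid combinations, such as a `top_p` above 1, from reaching the server.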