shleeeee/mistral-ko-7b-wiki-neft

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 8k · Architecture: Transformer

The shleeeee/mistral-ko-7b-wiki-neft model is a fine-tuned version of Mistral-7B-v0.1, developed by shleeeee (Seunghyeon Lee) and oopsung (Sungwoo Park). This 7-billion-parameter model is optimized for Korean-language tasks, leveraging a custom Korean dataset and NEFT (Noisy Embedding Fine-Tuning). It is designed for general-purpose text generation and understanding in Korean.

Overview

shleeeee/mistral-ko-7b-wiki-neft is a 7-billion-parameter language model fine-tuned from Mistral-7B-v0.1. Developed by shleeeee (Seunghyeon Lee) and oopsung (Sungwoo Park), it is specifically enhanced for the Korean language.

Key Characteristics

  • Base Model: Mistral-7B-v0.1.
  • Language Focus: Optimized for Korean using a custom Korean dataset.
  • Fine-tuning Method: Incorporates NEFT (Noisy Embedding Fine-Tuning) with a neftune_noise_alpha of 5.
  • LoRA Target Modules: Fine-tuned using LoRA on q_proj, k_proj, v_proj, o_proj, and gate_proj modules.
  • Training Details: Trained for 1,000 steps with a train batch size of 4 (a configuration sketch follows this list).
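
For orientation, here is a minimal sketch of how this setup could be reproduced with Hugging Face PEFT and TRL. The dataset file, LoRA rank, and output directory are assumptions not stated on this card; only the base model, target modules, neftune_noise_alpha, step count, and batch size come from the details above.

```python
# Hedged sketch of the fine-tuning setup described above, using PEFT + TRL.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical instruction dataset built from Korean Wikipedia.
dataset = load_dataset("json", data_files="korean_wiki_instructions.json")

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,  # assumed rank; not stated on this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
)

training_args = SFTConfig(
    output_dir="mistral-ko-7b-wiki-neft",
    max_steps=1000,                 # from this card
    per_device_train_batch_size=4,  # from this card
    neftune_noise_alpha=5,          # NEFT: adds noise to token embeddings during training
)

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # base model
    args=training_args,
    train_dataset=dataset["train"],
    peft_config=peft_config,
)
trainer.train()
```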

Usage and Evaluation

The model follows the Mistral instruction template: <s>[INST]{instruction}[/INST]{output}</s>. While specific benchmark scores are not detailed, the original model card includes an evaluation image documenting its performance assessment. The model is suitable for a range of Korean natural language processing tasks, including text generation and comprehension.
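
A minimal inference sketch with transformers, assuming the template above; the Korean example instruction and generation settings are illustrative, not taken from this page.

```python
# Minimal inference sketch for this model using transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shleeeee/mistral-ko-7b-wiki-neft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

instruction = "한국의 수도에 대해 설명해 주세요."  # "Please describe the capital of Korea."
prompt = f"<s>[INST]{instruction}[/INST]"  # template from the card; BOS written manually

# add_special_tokens=False so the tokenizer does not prepend a second <s>.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```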

Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model cover the following sampler settings: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
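
As an illustration, these sampler parameters map onto an OpenAI-compatible request. The endpoint URL and every value below are assumptions; only the parameter names come from this page.

```python
# Hedged sketch: sending the sampler parameters above through an
# OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="shleeeee/mistral-ko-7b-wiki-neft",
    messages=[{"role": "user", "content": "한국 전통 음식에 대해 알려주세요."}],
    temperature=0.7,          # illustrative values, not the page's
    top_p=0.9,                # actual top-3 configurations
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Parameters outside the OpenAI schema go in extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```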