jingyeom/KoSoLAR-10.7B-v0.2_1.3_dedup_p

Text Generation · Concurrency Cost: 1 · Model Size: 15B · Quant: FP8 · Ctx Length: 8k · Tool Calling: Supported · Published: Jan 23, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

KoSoLAR-10.7B-v0.2_1.3_dedup_p is a 15 billion parameter causal language model developed by jingyeom, based on yanolja/KoSOLAR-10.7B-v0.2. It was trained on a public dataset that was deduplicated with the algorithm from the paper "Deduplicating Training Data Makes Language Models Better." It targets general language generation, with data deduplication as its main training-side change.


Model Overview

jingyeom/KoSoLAR-10.7B-v0.2_1.3_dedup_p is a 15 billion parameter causal language model built on the yanolja/KoSOLAR-10.7B-v0.2 base model. Its distinguishing feature is the training data: jingyeom trained this iteration on a public dataset that had first been deduplicated with the algorithm described in the paper "Deduplicating Training Data Makes Language Models Better." The goal is to improve model quality by removing redundant training examples.
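To make the idea of deduplicating a dataset concrete, here is a minimal sketch of a near-duplicate filter. It is not the pipeline used for this model, and it is not the paper's implementation: it compares word n-gram sets with exact Jaccard similarity, which only scales to small collections. The function names, the 5-gram size, and the 0.8 threshold are illustrative assumptions.

```python
from typing import List, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 5) -> Set[Shingle]:
    """Return the set of lowercase word n-grams in a document."""
    words = text.lower().split()
    if len(words) < n:
        # Short documents are represented by a single shingle (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedup(docs: List[str], threshold: float = 0.8, n: int = 5) -> List[str]:
    """Keep each document unless it is too similar to one already kept.

    Illustrative only: this is O(N^2) in the number of documents.
    """
    kept: List[str] = []
    kept_shingles: List[Set[Shingle]] = []
    for doc in docs:
        s = shingles(doc, n)
        if any(jaccard(s, k) >= threshold for k in kept_shingles):
            continue  # near-duplicate of an earlier document, so drop it
        kept.append(doc)
        kept_shingles.append(s)
    return kept
```

A real pipeline would replace the pairwise comparison with an approximate or index-based method so that it scales to a full training corpus.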

Key Characteristics

  • Base Model: Built on yanolja/KoSOLAR-10.7B-v0.2.
  • Parameter Count: 15 billion parameters.
  • Training Data Strategy: Trained on deduplicated data. The cited paper reports that deduplication reduces verbatim memorization of training examples and lets models reach the same or better accuracy in fewer training steps.
  • Context Length: Supports a context window of 8192 tokens, which the prompt and the generated output share.
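Because the prompt and the generated tokens share the 8192-token window, a long prompt has to be trimmed before generation. The sketch below shows one way to budget for that. The helper name `fit_prompt` is hypothetical, and keeping the most recent tokens (dropping the oldest) is an assumption that suits chat-style inputs, not a requirement of the model.

```python
from typing import List

CTX_LENGTH = 8192  # context window stated in this card


def fit_prompt(
    token_ids: List[int],
    max_new_tokens: int,
    ctx_length: int = CTX_LENGTH,
) -> List[int]:
    """Trim a tokenized prompt so prompt + generated tokens fit the window.

    Keeps the most recent tokens and drops the oldest ones (an assumption).
    """
    if not 0 < max_new_tokens < ctx_length:
        raise ValueError("max_new_tokens must be between 1 and ctx_length - 1")
    budget = ctx_length - max_new_tokens  # tokens left for the prompt
    return token_ids[-budget:]
```

For example, reserving 512 tokens for the output leaves 7680 tokens for the prompt; a shorter prompt is returned unchanged.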

Potential Use Cases

This model is suited to natural language processing applications that depend on contextual understanding and fluent text generation. Because it was trained on deduplicated data, it may generalize better and be less prone to reproducing training artifacts, although this card reports no benchmark results to confirm that. Developers can load it with the Hugging Face Transformers library for tasks such as text generation, summarization, and question answering. Its performance can be evaluated on the Ko-LLM-Leaderboard.
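A minimal sketch of loading the model with Transformers follows. It assumes the repository ID in the title resolves on the Hugging Face Hub, that `torch`, `transformers`, and `accelerate` are installed (`device_map="auto"` relies on `accelerate`), and that half-precision weights fit on the available hardware. The card does not document a prompt template, so the function passes the prompt through as plain text. The `generate` wrapper and its defaults are illustrative, not part of any library.

```python
MODEL_ID = "jingyeom/KoSoLAR-10.7B-v0.2_1.3_dedup_p"  # taken from the card title


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Greedy-decode a continuation of `prompt` and return only the new text."""
    # Imported lazily so the heavy dependencies load only when generation runs.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: fp16 is acceptable for inference
        device_map="auto",          # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for reproducible output
        )

    # Slice off the prompt tokens so only the continuation is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use, so expect a large download and a slow first call.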