danielpark/gorani-100k-llama2-13b-instruct

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 13B · Quant: FP8 · Context length: 4k · Published: Oct 4, 2023 · Architecture: Transformer

danielpark/gorani-100k-llama2-13b-instruct is a 13 billion parameter instruction-tuned model based on Meta's Llama-2-13b-chat. Developed by danielpark, it is part of the GORANI project, a research initiative that experiments with dataset distribution for knowledge transfer and distillation. The project explores optimal datasets for various languages and specific domains, with a particular emphasis on refining a commercially usable Korean dataset.


What is danielpark/gorani-100k-llama2-13b-instruct?

This model is a 13 billion parameter instruction-tuned variant of Meta's Llama-2-13b-chat, developed by danielpark. It is an experimental model from the GORANI project, which investigates how dataset distribution can be used to transfer or distill knowledge from English-language datasets, with the aim of finding optimal datasets for other languages and specific domains.
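
Because the model follows the standard Llama-2 causal-LM interface, it can in principle be loaded with the Hugging Face transformers library. The sketch below is illustrative only: it assumes you have access to the weights (the repository was later made private; see below), and the Llama-2-chat-style prompt template and generation settings are assumptions not documented on the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "danielpark/gorani-100k-llama2-13b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B weights need roughly 26 GB in fp16
    device_map="auto",          # requires the accelerate package
)

# Llama-2-chat-style prompt format (an assumption; the card does not
# document the template this model was tuned with).
prompt = "[INST] Explain knowledge distillation in one paragraph. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```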

Key Characteristics:

  • Base Model: Llama-2-13b-chat, a 13 billion parameter model.
  • Project Focus: Part of the GORANI project, which is a research initiative exploring dataset optimization for knowledge transfer.
  • Experimental Techniques: Preliminary experiments included techniques such as RoPE scaling, Attention Sinks, Flash Attention 1 and 2, SWA (Sliding Window Attention), and GQA (Grouped Query Attention); see the configuration sketch after this list.
  • Licensing: Currently under both the restrictive LLaMA 2 license and CC-BY-NC-4.0 (inherited from dataset licenses), which together prohibit commercial use.
  • Research-Oriented: Primarily intended for research purposes, with a future goal of developing a commercially usable Korean dataset (KORANI) based on GORANI's findings.
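
The card names these techniques but does not say how they were configured. The following is only a rough sketch of how two of them, RoPE scaling and Flash Attention 2, are commonly enabled when loading a Llama-2-style checkpoint with a recent transformers release; the scaling factor is an illustrative assumption, and Attention Sinks, SWA, and GQA would require model-level changes that plain Llama-2-13b does not expose through this API.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "danielpark/gorani-100k-llama2-13b-instruct"  # private as of Nov 2023

config = AutoConfig.from_pretrained(model_id)
# RoPE scaling: stretch the rotary position embeddings to extend the 4k
# context window. The factor 2.0 is an illustrative value, not from the card.
config.rope_scaling = {"type": "linear", "factor": 2.0}

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype=torch.float16,
    device_map="auto",
    # Flash Attention 2 kernel; requires transformers >= 4.36 and the
    # flash-attn package to be installed.
    attn_implementation="flash_attention_2",
)
```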

Important Considerations:

  • Non-Commercial Use Only: Due to strict licensing, this model is not to be used for commercial purposes.
  • Experimental Status: The model weights are not considered official and are subject to change.
  • Private Project: As of November 2023, the project was made private, with potential future release in a non-public format on cloud platforms.