hfl/llama-3-chinese-8b

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quantization: FP8 · Context Length: 8k · Published: Apr 22, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

hfl/llama-3-chinese-8b is an 8 billion parameter language model developed by hfl, further pre-trained on Meta-Llama-3-8B with an additional 120 GB of Chinese text corpora. This foundation model is specifically designed for enhanced Chinese language understanding and generation, building upon the Llama 3 architecture. It features an 8192-token context length and is optimized for tasks requiring strong Chinese linguistic capabilities, serving as a base for further fine-tuning.


Llama-3-Chinese-8B Overview

Llama-3-Chinese-8B is an 8 billion parameter foundation model developed by hfl, building upon the robust Meta-Llama-3-8B architecture. Its primary differentiator is extensive further pre-training using 120 GB of Chinese text corpora, significantly enhancing its proficiency in the Chinese language.

Key Characteristics

  • Base Model: Meta-Llama-3-8B, providing a strong general-purpose linguistic foundation.
  • Chinese Language Enhancement: Specialized pre-training on a large volume of Chinese text data for superior performance in Chinese contexts.
  • Parameter Count: 8 billion parameters, offering a balance between capability and computational efficiency.
  • Context Length: Supports an 8192-token context window.
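One practical consequence of the fixed 8192-token window is that long prompts must be trimmed before generation. A minimal illustrative helper (not part of the model release; the function name and truncation strategy are assumptions) that keeps a tokenized prompt within the window while reserving room for generated tokens:

```python
# Illustrative context-budget helper for an 8192-token window.
# Hypothetical code: the model card does not prescribe a truncation strategy.

CONTEXT_LENGTH = 8192  # maximum tokens the model can attend to


def fit_to_context(token_ids: list[int], max_new_tokens: int) -> list[int]:
    """Drop the oldest tokens so prompt + generation fits in the window."""
    budget = CONTEXT_LENGTH - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens exceeds the context window")
    # Keep only the most recent `budget` tokens.
    return token_ids[-budget:]


# Example: a 9000-token prompt with 256 tokens reserved for generation
trimmed = fit_to_context(list(range(9000)), max_new_tokens=256)
print(len(trimmed))  # 8192 - 256 = 7936
```

Truncating from the front keeps the most recent context, which is usually the right choice for continuation-style prompts.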

Important Considerations

This model is released as a foundation model. This means it is primarily intended as a base for further development and fine-tuning. It is not directly suitable for conversational AI, question-answering, or similar instruction-following tasks without additional fine-tuning. Developers should consider this model for applications where strong Chinese language understanding and generation are critical, and where subsequent fine-tuning for specific downstream tasks is planned.
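Because this is a base model, inputs should read as the start of a passage to be continued, not as a question or instruction. A hedged sketch of plain text completion with the Hugging Face `transformers` library (an assumption: the card does not prescribe a loading method; requires `transformers`, `torch`, and enough memory for an 8B checkpoint):

```python
# Hypothetical usage sketch for text completion with a base (non-instruct) model.
MODEL_ID = "hfl/llama-3-chinese-8b"


def completion_prompt(passage_start: str) -> str:
    """Base models continue text rather than follow instructions, so the
    input should read as the beginning of a passage, not a question."""
    return passage_start.strip()


def main() -> None:
    # Heavy imports kept inside main(): loading downloads an 8B checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = completion_prompt("人工智能的发展历史可以追溯到")  # "The history of AI can be traced back to"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    print(tokenizer.decode(out[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

For chat or question-answering behavior, fine-tune first (or use an instruction-tuned derivative) rather than prompting the base model conversationally.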

For more detailed information, including performance benchmarks and usage guidelines, refer to the official GitHub project page.

Popular Sampler Settings

The most popular sampler configurations among Featherless users for this model tune the following parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.