Overview
NYTK/PULI-LlumiX-Llama-3.1 is an 8.03-billion-parameter language model, continually pretrained from the Llama 3.1 8B Instruct base. Developed by NYTK, the model focuses on improving performance for Hungarian while retaining strong English capabilities.
Key Capabilities
- Hungarian Language Proficiency: Significantly improved through continued pretraining on 8.08 billion words of Hungarian text, including documents exceeding 5000 words and Hungarian Wikipedia.
- Extended Context Length: Supports a maximum sequence length of 16,384 tokens, enabling processing of longer texts.
- Chat Model Functionality: Inherits chat capabilities from its Llama 3.1 8B Instruct base, making it suitable for conversational AI applications.
- English Language Support: Maintained and enhanced with additional pretraining on English Long Context QA (2 billion words) and BookSum (78 million words) datasets.
- Technical Specifications: Operates in bfloat16 precision for efficient computation (see the loading sketch after this list).
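The snippet below is a minimal loading sketch, assuming the standard Hugging Face transformers API. The repository id is taken from the model name above; the device placement and other arguments are illustrative assumptions rather than settings prescribed by the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NYTK/PULI-LlumiX-Llama-3.1"  # repository id assumed from the model name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the model's bfloat16 precision
    device_map="auto",           # illustrative: place weights automatically
)
# The model supports sequences up to 16,384 tokens.
```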
Usage
This model can be used for a range of text generation tasks, particularly those involving the Hungarian language. Its chat model origins allow it to be applied directly in conversational agents, as sketched below. The model was developed with LLaMA-Factory, and users are asked to cite the "PULI Chat: Our First Hungarian Conversational Model" paper when using it.
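Below is a hedged chat-style generation sketch, reusing the model and tokenizer from the loading example and assuming the tokenizer ships the Llama 3.1 Instruct chat template. The prompt text and sampling settings are purely illustrative.

```python
# Illustrative Hungarian chat turn ("Briefly introduce the PULI model family!").
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Mutasd be röviden a PULI modellcsaládot!"},
]

# Build input ids from the chat template and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,   # illustrative generation length
    do_sample=True,
    temperature=0.7,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```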