Overview
This model, emillykkejensen/Llama-3-8B-instruct-dansk, is an instruction-tuned variant of Meta's powerful Llama-3-8B base model. It has been specifically fine-tuned to improve its performance and instruction-following capabilities for the Danish language.
Key Characteristics
- Base Model: Meta-Llama-3-8B, an 8 billion parameter causal language model.
- Fine-tuning Dataset: Utilizes the kobprof/skolegpt-instruct dataset, indicating a focus on Danish instructional data.
- Evaluation Performance: Achieved a loss of 0.9477 on its evaluation set, demonstrating its adaptation to the target language and task.
- Context Length: Inherits the 8192 token context length from the Llama-3-8B base.
Training Details
The model was trained with a learning rate of 2e-05, a batch size of 1 per device across 4 GPUs, and a cosine learning rate scheduler with a 0.2 warmup ratio over 1 epoch. The optimizer used was Adam with betas=(0.9, 0.999) and epsilon=1e-08.
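The schedule above (peak learning rate 2e-05, warmup ratio 0.2, cosine decay) can be sketched as a small function. This is an illustrative approximation of the shape produced by a linear-warmup cosine scheduler, not the exact trainer implementation:

```python
import math

def lr_at_step(step, total_steps, peak_lr=2e-05, warmup_ratio=0.2):
    # Linear warmup over the first 20% of training steps,
    # then cosine decay from the peak down to zero.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total steps the learning rate rises linearly to 2e-05 at step 200, then follows a cosine curve back toward zero by the final step.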
Intended Use Cases
This model is particularly well-suited for applications requiring instruction-following in Danish, such as:
- Generating Danish text based on specific prompts.
- Danish language chatbots or conversational AI.
- Educational tools or content creation in Danish.
- Any task where a robust, instruction-tuned Danish LLM is beneficial.
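For the use cases above, prompts should follow the Llama-3 Instruct chat format, which this fine-tune presumably inherits from its base model (an assumption; in practice, `tokenizer.apply_chat_template` handles this automatically). A minimal sketch of building such a prompt by hand:

```python
def build_prompt(user_message, system_message=None):
    # Assumes the standard Llama-3 Instruct special tokens;
    # prefer tokenizer.apply_chat_template in real code.
    parts = ["<|begin_of_text|>"]
    if system_message is not None:
        parts.append("<|start_header_id|>system<|end_header_id|>\n\n"
                     f"{system_message}<|eot_id|>")
    parts.append("<|start_header_id|>user<|end_header_id|>\n\n"
                 f"{user_message}<|eot_id|>")
    # Leave the assistant header open so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt("Hvad er hovedstaden i Danmark?")
```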