walid-iguider/Llama-3-8B-4bit-UltraChat-Ita
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 3, 2024License:apache-2.0Architecture:Transformer Open Weights Warm
The walid-iguider/Llama-3-8B-4bit-UltraChat-Ita is an 8 billion parameter Llama-3 model developed by walid-iguider, fine-tuned from unsloth/llama-3-8b-bnb-4bit. This model is specifically optimized for Italian language tasks, demonstrating competitive performance on Italian benchmarks such as hellaswag_it, arc_it, and m_mmlu_it. It is designed for applications requiring efficient and accurate Italian language processing.
Loading preview...
Overview
This model, developed by walid-iguider, is an 8 billion parameter Llama-3 variant, fine-tuned from unsloth/llama-3-8b-bnb-4bit. It was trained using Unsloth for 2x faster processing and Huggingface's TRL library. The model is licensed under Apache-2.0.
Key Capabilities
- Italian Language Proficiency: Specifically fine-tuned and evaluated for performance in Italian.
- Efficient Training: Utilizes Unsloth for accelerated training, suggesting potential for efficient deployment.
- Benchmark Performance: Achieves an average normalized accuracy of 0.5334 across key Italian benchmarks, including:
hellaswag_it acc_norm: 0.6064arc_it acc_norm: 0.4611m_mmlu_it 5-shot acc: 0.5328
Good For
- Applications requiring a capable large language model for Italian text generation and understanding.
- Developers looking for an efficiently trained Llama-3 based model with a focus on Italian language tasks.
- Use cases where performance on Italian-specific benchmarks is critical.