WiroAI/wiroai-turkish-llm-8b

Hugging Face
Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 32K · Concurrency Cost: 1 · Published: Sep 6, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

WiroAI/wiroai-turkish-llm-8b is an 8 billion parameter decoder-only transformer model developed by WiroAI, built on Meta's LLaMA 3.1 architecture. Fine-tuned with over 500,000 high-quality Turkish instructions, it excels in Turkish language processing tasks, offering strong local context and cultural understanding. This model is optimized for text generation, question answering, summarization, and analysis specifically for Turkish content, with a 32K context length.


WiroAI/wiroai-turkish-llm-8b: A Turkish-Optimized LLaMA Model

WiroAI/wiroai-turkish-llm-8b is an 8 billion parameter language model developed by WiroAI, based on Meta's LLaMA 3.1 architecture. It was fine-tuned specifically for the Turkish language and culture on over 500,000 high-quality Turkish instructions using the LoRA method, without quantization, and aims to deliver strong performance on Turkish natural language processing tasks.
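A LoRA fine-tuning setup of the kind described above can be sketched with Hugging Face's `peft` library. This is a minimal illustration only: the rank, alpha, dropout, and target modules below are assumed values for a LLaMA-style decoder, not WiroAI's published hyperparameters.

```python
from peft import LoraConfig

# Illustrative LoRA configuration for a LLaMA-style decoder-only model.
# r, lora_alpha, lora_dropout, and target_modules are assumptions for
# demonstration, not the hyperparameters WiroAI actually used.
lora_config = LoraConfig(
    r=16,                    # low-rank dimension of the adapter matrices
    lora_alpha=32,           # scaling factor applied to the adapter update
    lora_dropout=0.05,       # dropout on the adapter path during training
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",   # decoder-only language modeling
)
```

Such a config would typically be applied with `peft.get_peft_model(model, lora_config)` before training; since the card notes the fine-tune was done without quantization, no quantized base-model loading is involved.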

Key Capabilities

  • Turkish Language Understanding: Demonstrates strong comprehension of Turkish culture, idioms, and local context.
  • Text Generation & Editing: Capable of generating and refining Turkish text.
  • Question Answering & Summarization: Performs well in answering questions and summarizing content in Turkish.
  • Analysis & Reasoning: Supports analytical and reasoning tasks within the Turkish language.
  • Resource Efficiency: Designed for effective operation even with limited hardware resources.
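A minimal inference sketch with the `transformers` library, assuming the model uses the standard LLaMA 3.1 chat template via `apply_chat_template`; the Turkish system prompt, example instruction, and generation settings are illustrative assumptions, not part of the model card.

```python
from typing import Dict, List


def build_messages(
    instruction: str,
    system: str = "Sen yardımcı bir Türkçe asistansın.",  # assumed example system prompt
) -> List[Dict[str, str]]:
    """Build a chat-format message list for the model's chat template."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": instruction},
    ]


if __name__ == "__main__":
    # Heavyweight part: downloads the 8B checkpoint, so it stays under the main guard.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "WiroAI/wiroai-turkish-llm-8b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    messages = build_messages("İstanbul'un tarihi yarımadasını üç cümleyle özetle.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Keeping the prompt construction in a separate helper makes it easy to reuse the same message format for question answering, summarization, or analysis tasks.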

Performance Highlights

The model shows competitive performance on Turkish benchmarks, including MMLU TR, ARC TR, and WinoGrande TR, often outperforming other Turkish-specific models and the base Meta-Llama-3-8B-Instruct on several metrics; for instance, it scores 52.4 on MMLU TR and 57.0 on WinoGrande TR. For best results, use clear and structured instructions, and verify outputs in critical applications. The model is released under the Apache 2.0 license.