NovusResearch/Thestral-0.1-tr-chat-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Mar 15, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

NovusResearch/Thestral-0.1-tr-chat-7B is a 7 billion parameter causal language model, fine-tuned from Mistral-7B-v0.1, specifically optimized for Turkish language tasks. This model leverages diverse translated Turkish datasets, including versions of OpenHermes-2.5 and SlimOrca, to enhance its conversational capabilities in Turkish. It is designed for chat-based applications requiring strong performance in the Turkish language.

Loading preview...

Thestral-0.1-tr-chat-7B: Turkish-Optimized Chat Model

The Thestral-0.1-tr-chat-7B is a 7 billion parameter language model developed by NovusResearch, built upon the Mistral-7B-v0.1 architecture. Its primary distinction lies in its comprehensive fine-tuning on a wide array of Turkish datasets, making it highly proficient in Turkish language understanding and generation.

Key Capabilities & Training

  • Turkish Language Specialization: The model has undergone full fine-tuning using translated Turkish datasets, including versions of teknium/OpenHermes-2.5 and Open-Orca/SlimOrca, ensuring strong performance in Turkish conversational contexts.
  • Base Model: Utilizes mistralai/Mistral-7B-v0.1 as its foundational architecture, inheriting its robust causal language modeling capabilities.
  • Training Configuration: Fine-tuned using axolotl with a sequence length of 8192 and a learning rate of 0.000005 over 2 epochs, incorporating techniques like gradient checkpointing and Flash Attention.

Performance Metrics (Turkish Leaderboard)

Evaluated on the OpenLLMTurkishLeaderboard, the model demonstrates an average score of 36.41, with specific scores including:

  • MMLU: 40.64
  • TruthfulQA: 47.90
  • Winogrande: 50.86
  • AI2 Reasoning Challenge: 27.24
  • HellaSwag: 33.90
  • GSM8k: 17.91

Ideal Use Cases

  • Turkish Chatbots: Excellent for developing conversational AI agents that interact in Turkish.
  • Turkish Content Generation: Suitable for generating text, summaries, or creative content in Turkish.
  • Research & Development: A strong base for further fine-tuning or research into Turkish NLP applications.