khaled314/qwen25-saudi-v4
The khaled314/qwen25-saudi-v4 is a 7.6 billion parameter causal language model, fine-tuned from Qwen/Qwen2.5-7B-Instruct. This model is specifically optimized for tasks involving the Saudi dialect, having been trained on the saudi_dialect_train dataset. It is designed to enhance performance in applications requiring understanding and generation of text in this specific Arabic dialect, leveraging a 32768 token context length.
Loading preview...
Overview
The khaled314/qwen25-saudi-v4 model is a specialized large language model, fine-tuned from the robust Qwen/Qwen2.5-7B-Instruct base. With 7.6 billion parameters and a substantial 32768 token context length, its primary distinction lies in its targeted training on the saudi_dialect_train dataset. This focused fine-tuning aims to significantly improve its proficiency and accuracy when processing or generating content in the Saudi dialect.
Key Capabilities
- Saudi Dialect Proficiency: Enhanced understanding and generation of text specific to the Saudi dialect.
- Instruction Following: Inherits instruction-following capabilities from its Qwen2.5-7B-Instruct base.
- Large Context Window: Benefits from a 32768 token context length, suitable for handling longer inputs and maintaining conversational coherence.
Training Details
The model was trained with a learning rate of 2e-05 over 3 epochs, utilizing an AdamW optimizer and a cosine learning rate scheduler. It achieved a training loss of 4.6754 on the evaluation set.
Good For
- Applications requiring high accuracy in Saudi dialect text processing.
- Chatbots or virtual assistants designed for users in Saudi Arabia.
- Content generation, translation, or analysis tasks specifically targeting the Saudi dialect.