neuroturk/HYZ-01-0.6B
HYZ-01-0.6B is a 0.6 billion parameter instruction-tuned causal language model developed by NeuroTürk, specifically optimized for the Turkish language. Built on a multilingual foundation, it underwent extensive Turkish continual pre-training and fine-tuning on 372,697 instruction-response pairs, with an extended tokenizer for Turkish morphology. This lightweight model excels at instruction-following tasks in Turkish, including conversation, question answering, summarization, and code generation, with a training context length of 4,096 tokens and a theoretical maximum of 32,768 tokens.
Loading preview...
Overview
NeuroTürk's HYZ-01-0.6B is a lightweight, instruction-tuned language model with approximately 0.6 billion parameters, primarily designed for the Turkish language. It is built upon a multilingual foundation covering 119 languages, followed by a 4-stage Turkish continual pre-training (CPT) process. The model was then fine-tuned using Supervised Fine-Tuning (SFT) on a dataset of 372,697 high-quality Turkish instruction-response pairs, enhancing its ability to follow instructions across various tasks.
Key Capabilities & Features
- Turkish Language Optimization: Extensive CPT and SFT specifically for Turkish, with an extended tokenizer to handle Turkish morphological structures and advanced use cases.
- Instruction Following: Improved performance in conversation, question answering, summarization, and code generation due to instruction tuning.
- Tokenizer Extensions: Includes new special tokens for Turkish morphological features and structural use cases like chain-of-thought, code blocks, and dialogue management.
- Efficient Architecture: Features a hidden dimension of 1,024, 28 layers, and Grouped-Query Attention (GQA) with 16 attention heads (Q) and 8 (KV).
- Context Length: Supports a training context length of 4,096 tokens, with a theoretical maximum of 32,768 tokens.
- Low Resource Usage: Operates with BFloat16 precision, requiring approximately 1.11 GB of VRAM and disk space.
Benchmarks & Limitations
Evaluations on Turkish-specific benchmarks show scores such as 89.10% on TurBLiMP (ditransitive) and 33.08% on Global MMLU TR (5-shot). While effective for its size, the model may produce occasional incorrect outputs, and its 0.6B parameter count limits complex multi-step reasoning. Performance significantly decreases in languages other than Turkish, and human verification of outputs is recommended for critical applications.