Overview

NeuroTürk's HYZ-01-0.6B is a lightweight, instruction-tuned language model with approximately 0.6 billion parameters, designed primarily for Turkish. It builds on a multilingual foundation model covering 119 languages, which then went through a 4-stage Turkish continual pre-training (CPT) process. Supervised fine-tuning (SFT) on 372,697 high-quality Turkish instruction-response pairs followed, improving the model's ability to follow instructions across a range of tasks.

Key Capabilities & Features

Turkish Language Optimization: Extensive CPT and SFT specifically for Turkish, with an extended tokenizer to handle Turkish morphological structures and advanced use cases.
Instruction Following: Improved performance in conversation, question answering, summarization, and code generation due to instruction tuning.
Tokenizer Extensions: Includes new special tokens for Turkish morphological features and structural use cases like chain-of-thought, code blocks, and dialogue management.
Efficient Architecture: Features a hidden dimension of 1,024, 28 layers, and Grouped-Query Attention (GQA) with 16 query heads sharing 8 key/value heads.
Context Length: Trained with a context length of 4,096 tokens; the theoretical maximum is 32,768 tokens.
Low Resource Usage: Operates with BFloat16 precision, requiring approximately 1.11 GB of VRAM and disk space.
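
The head counts and the memory figure in the list above can be checked with a little arithmetic. The following is a minimal sketch, not the model's actual implementation. It assumes that query heads are grouped contiguously onto key/value heads (a common convention, not confirmed by the source) and that "approximately 0.6 billion" is exactly 600,000,000 parameters. The names kv_head_for_query_head and bf16_weight_bytes are hypothetical helpers introduced only for illustration.

```python
# Figures taken from the feature list above.
NUM_QUERY_HEADS = 16
NUM_KV_HEADS = 8
BYTES_PER_BF16 = 2  # BFloat16 stores each value in 16 bits

# Assumption: the source only says "approximately 0.6 billion" parameters,
# so a round 600 million is used here as a stand-in.
APPROX_PARAMS = 600_000_000


def kv_head_for_query_head(q_head: int) -> int:
    """Return the key/value head index used by a query head.

    Assumes contiguous grouping: consecutive query heads share one KV head.
    """
    group_size = NUM_QUERY_HEADS // NUM_KV_HEADS  # query heads per KV head
    return q_head // group_size


def bf16_weight_bytes(num_params: int) -> int:
    """Return the bytes needed to store num_params weights in BFloat16."""
    return num_params * BYTES_PER_BF16


# Map every query head to its KV head; each KV head serves two query heads.
mapping = [kv_head_for_query_head(q) for q in range(NUM_QUERY_HEADS)]

# Convert the weight size to GiB (2**30 bytes) for comparison with the card.
weight_gib = bf16_weight_bytes(APPROX_PARAMS) / 2**30

print(mapping)
print(f"{weight_gib:.2f} GiB")
```

With 16 query heads and 8 key/value heads, the key/value cache holds half as many heads as it would under standard multi-head attention. The weight estimate lands close to the stated 1.11 GB when that figure is read in units of 2**30 bytes; the small gap comes from the rounded parameter count assumed here.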

Benchmarks & Limitations

Evaluations on Turkish-specific benchmarks report scores such as 89.10% on TurBLiMP (ditransitive) and 33.08% on Global MMLU TR (5-shot). While effective for its size, the model may occasionally produce incorrect outputs, and its 0.6B parameter count limits complex multi-step reasoning. Performance drops significantly in languages other than Turkish, and human verification of outputs is recommended for critical applications.

Full Model Card (README)