Overview
VeriUS-LLM-8b-v0.2 is an 8-billion-parameter instruction-following large language model developed by VeriUs. It is built on the unsloth/llama-3-8b-bnb-4bit base model and fine-tuned with QLoRA and ORPO on a carefully curated general-domain Turkish instruction dataset. This specialization makes it particularly adept at understanding and generating Turkish text.
Key Capabilities
- Turkish Language Proficiency: Optimized for instruction following in Turkish, leveraging a dedicated Turkish instruction dataset during fine-tuning.
- Efficient Inference: Designed for fast inference, with specific support and usage examples provided for the Unsloth library.
- Llama 3 Architecture: Benefits from the robust Llama 3 base model, providing a strong foundation for language tasks.
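Because the model inherits the Llama 3 architecture, prompts are typically formatted with the Llama 3 chat template. The sketch below builds such a prompt by hand for illustration; the special tokens are the standard Llama 3 header tokens, the helper name and the Turkish messages are examples, and in practice the tokenizer's `apply_chat_template()` should be preferred:

```python
def build_llama3_prompt(user_message: str,
                        system_message: str = "Sen yardımcı bir asistansın.") -> str:
    """Format a single-turn prompt with the standard Llama 3 chat template.

    Illustrative only: in real code, prefer tokenizer.apply_chat_template(),
    which handles the special tokens for you.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example Turkish instruction (hypothetical usage):
prompt = build_llama3_prompt("Türkiye'nin başkenti neresidir?")
```

The trailing assistant header leaves the prompt open for the model to complete, which is how Llama 3 chat models are conventionally queried.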
Training Details
The model was fine-tuned with a PER_DEVICE_BATCH_SIZE of 2 and GRADIENT_ACCUMULATION_STEPS of 4 (an effective batch size of 8 per update), a learning rate of 8e-6, and 2 epochs. PEFT arguments included a RANK of 128 and a LORA_ALPHA of 256, targeting key projection modules.
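The hyperparameters above can be gathered into a single configuration sketch. The dictionary keys simply mirror the capitalized argument names in the text; the effective batch size is the per-device batch size multiplied by the gradient accumulation steps:

```python
# Hyperparameters as reported in the training details above.
TRAINING_ARGS = {
    "PER_DEVICE_BATCH_SIZE": 2,
    "GRADIENT_ACCUMULATION_STEPS": 4,
    "LEARNING_RATE": 8e-6,
    "NUM_EPOCHS": 2,
}

PEFT_ARGS = {
    "RANK": 128,       # LoRA rank r
    "LORA_ALPHA": 256, # LoRA scaling parameter
}

# Effective batch size seen by the optimizer per update step.
effective_batch_size = (
    TRAINING_ARGS["PER_DEVICE_BATCH_SIZE"]
    * TRAINING_ARGS["GRADIENT_ACCUMULATION_STEPS"]
)  # 2 * 4 = 8
```

Note that LORA_ALPHA is exactly twice the RANK (alpha/r = 2), a common scaling choice in LoRA fine-tuning.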
Limitations and Considerations
- Primary Function: As an autoregressive language model, its core function is next-token prediction. While versatile, it has not undergone extensive real-world application testing.
- Language Nuances: Although fine-tuned for Turkish, its handling of slang, informal registers, or languages other than Turkish may be limited, potentially leading to errors.
- Potential for False Information: Users should be aware that the model may generate inaccurate or misleading information, and outputs should be treated as suggestions rather than definitive answers.