Overview
Fietje 2 Chat: An Efficient Dutch LLM
Fietje 2 Chat is a 2.7 billion parameter language model developed by Bram Vanroy and optimized specifically for Dutch. It is the DPO-tuned (aligned) chat version of Fietje 2: it builds on the instruct model, which is itself an adaptation of Microsoft's Phi-2.
Key Capabilities & Features
- Dutch Language Specialization: Tailored for Dutch text generation through training on 28 billion tokens of Dutch data.
- Efficiency: Despite its small size (2.7B parameters), Fietje 2 Chat performs nearly on par with Dutch LLMs twice its size, such as GEITje 7B Ultra.
- DPO-Tuned: Fine-tuned with Direct Preference Optimization (DPO) on a combination of cleaned Dutch preference datasets, ultra_feedback_dutch_cleaned and orca_dpo_pairs_dutch_cleaned, totaling over 18,000 samples.
- Chat-Optimized: Designed for conversational applications, offering an aligned model for interactive use cases (a usage sketch follows this list).
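Below is a minimal inference sketch using the Hugging Face transformers chat pipeline; the repository id BramVanroy/fietje-2-chat and the generation settings are assumptions for illustration, not taken from this card.

```python
# Minimal chat inference sketch for Fietje 2 Chat (assumed repo id).
import torch
from transformers import pipeline

chatbot = pipeline(
    "text-generation",
    model="BramVanroy/fietje-2-chat",  # assumed Hugging Face repository id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Dutch prompt: "What is the capital of the Netherlands?"
messages = [{"role": "user", "content": "Wat is de hoofdstad van Nederland?"}]

# The pipeline applies the model's chat template and appends the assistant reply.
result = chatbot(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])
```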
Training Details
The model was trained with the alignment-handbook using DeepSpeed, on computational resources provided by the Flemish Supercomputer Center (VSC). Training used a learning rate of 2e-06 and a batch size of 8, for one epoch.
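For concreteness, the DPO stage can be sketched with trl's DPOTrainer, the library the alignment-handbook recipes build on. The hyperparameters below are taken from the card; the dataset repository ids, their column layout, and all remaining settings are assumptions, so this is a sketch rather than the exact recipe.

```python
# Hedged sketch of a DPO run in the spirit of the card; not the exact recipe.
from datasets import concatenate_datasets, load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "BramVanroy/fietje-2-instruct"  # assumed id of the instruct starting point
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed dataset ids; both are expected to provide prompt/chosen/rejected columns.
ultra = load_dataset("BramVanroy/ultra_feedback_dutch_cleaned", split="train")
orca = load_dataset("BramVanroy/orca_dpo_pairs_dutch_cleaned", split="train")
train_dataset = concatenate_datasets([ultra, orca])  # ~18k preference pairs in total

config = DPOConfig(
    output_dir="fietje-2-chat-dpo",
    learning_rate=2e-6,             # from the training details above
    per_device_train_batch_size=8,  # card reports batch size 8; per-device mapping is an assumption
    num_train_epochs=1,             # from the training details above
    bf16=True,                      # assumption
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```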
Intended Uses & Limitations
Fietje 2 Chat is intended for Dutch language generation and conversational AI. Users should be aware of general LLM limitations, including the potential for hallucinations and inaccuracies, as noted for the base Phi-2 model.