BramVanroy/fietje-2-chat

Text generation · 2.7B parameters · BF16 · 2k context length · Published: Apr 29, 2024 · License: MIT · Architecture: Transformer

BramVanroy/fietje-2-chat is a 2.7 billion parameter DPO-tuned chat model, adapted from Microsoft's Phi-2 and tailored for Dutch text generation. It was trained on 28 billion tokens of Dutch data, making it an efficient, specialized model for Dutch language tasks. Despite its compact size, it performs comparably to larger Dutch LLMs, offering a practical option for conversational applications in Dutch.


Fietje 2 Chat: An Efficient Dutch LLM

Fietje 2 Chat is a 2.7 billion parameter language model developed by Bram Vanroy, specifically optimized for the Dutch language. It is a DPO-tuned (aligned) chat version, building upon the instruct model, which itself is an adaptation of Microsoft's Phi-2 architecture.
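As a chat model, Fietje 2 Chat expects a conversation to be serialized with a chat template before generation. The sketch below illustrates the idea in plain Python; the `<|user|>`/`<|assistant|>` markers are an assumed ChatML-style convention for illustration only, and in practice you would call the tokenizer's `apply_chat_template()` method, which applies the actual template shipped with the model repository.

```python
def build_prompt(messages):
    """Flatten a chat history into a single prompt string.

    NOTE: the role markers below are illustrative assumptions, not the
    model's real template. With transformers, use
    tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    to get the correct serialization for this model.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}")
    parts.append("<|assistant|>\n")  # cue the model to produce a reply
    return "\n".join(parts)

messages = [
    {"role": "user", "content": "Wat is de hoofdstad van Nederland?"},
]
print(build_prompt(messages))
```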

Key Capabilities & Features

  • Dutch Language Specialization: Tailored for Dutch text generation through training on 28 billion tokens of Dutch data.
  • Efficiency: Despite its small size (2.7B parameters), Fietje 2 Chat performs nearly on par with Dutch LLMs twice its size, such as GEITje 7B Ultra.
  • DPO-Tuned: Fine-tuned using Direct Preference Optimization (DPO) on a combination of cleaned Dutch datasets, including ultra_feedback_dutch_cleaned and orca_dpo_pairs_dutch_cleaned, totaling over 18,000 samples.
  • Chat-Optimized: Designed for conversational applications, offering an aligned model for interactive use cases.
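DPO trains the model directly on preference pairs: for each prompt it pushes the policy's implicit reward for the chosen response above that for the rejected one, relative to a frozen reference model. A minimal sketch of the per-pair loss (variable names and the `beta=0.1` default are assumptions for illustration; the actual training used the alignment-handbook tooling):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy (pi_*) and the frozen reference (ref_*).
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response than the reference model does, scaled by beta.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # Negative log-sigmoid of the margin: small when the policy already
    # prefers the chosen response, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the policy matches the reference, the margin is 0 and the
# loss is -log(0.5) = log(2) ≈ 0.6931.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # → 0.6931
```

Minimizing this loss increases the likelihood gap between chosen and rejected responses without training a separate reward model, which is what makes DPO a lightweight alignment step on top of the instruct model.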

Training Details

The model was trained with the alignment-handbook using DeepSpeed, leveraging computational resources from the Flemish Supercomputer Center (VSC). Training ran for one epoch with a learning rate of 2e-06 and a batch size of 8.

Intended Uses & Limitations

Fietje 2 Chat is intended for Dutch language generation and conversational AI. Users should be aware of general LLM limitations, including potential for hallucinations and inaccuracies, as noted for the base Phi-2 model.