BramVanroy/fietje-2
Fietje 2 is a 2.7 billion parameter causal language model developed by Bram Vanroy, adapted from Microsoft's Phi-2. It is tailored to Dutch text generation through continued pretraining on 28 billion Dutch tokens. Despite its small size, it performs comparably to larger Dutch LLMs, which makes it suitable for a range of Dutch language processing tasks.
Fietje 2: An Efficient Dutch LLM
Fietje 2, developed by Bram Vanroy, is a 2.7 billion parameter language model based on microsoft/phi-2. It was adapted to Dutch through continued pretraining on 28 billion Dutch tokens, including a significant portion of Dutch Wikipedia and CulturaX data, so that it excels at Dutch text generation.
Key Capabilities & Features
- Dutch Language Proficiency: Optimized for generating and understanding Dutch text through extensive pretraining on a high-quality Dutch dataset.
- Efficiency: Despite its relatively small size (2.7B parameters), Fietje 2 demonstrates performance nearly on par with larger Dutch LLMs, such as GEITje 7B Ultra, offering a more efficient solution.
- Open-source Foundation: Built upon the phi-2 model, inheriting its architectural strengths.
- Multiple Versions: Available in base, instruct, and chat variants to suit different application needs.
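To make the efficiency point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. It assumes 16-bit weights (2 bytes per parameter) and takes 7 billion as the nominal size implied by the name GEITje 7B Ultra; both are assumptions for illustration, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Parameter count stated for Fietje 2 in this document.
FIETJE_PARAMS = 2.7e9
# Assumption: nominal size taken from the name "GEITje 7B Ultra".
LARGER_MODEL_PARAMS = 7e9

for name, params in [("Fietje 2", FIETJE_PARAMS), ("7B model", LARGER_MODEL_PARAMS)]:
    # Assumption: 16-bit precision, i.e. the default of 2 bytes per parameter.
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights in 16-bit precision")
```

Under these assumptions the smaller model needs well under half the weight memory of a 7B model, which is the practical sense in which it is the more efficient option.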
Training Details
Fietje 2 was trained for approximately two weeks on 16 A100 80GB GPUs, using the alignment-handbook and DeepSpeed. Training used a learning rate of 9e-05 and a total batch size of 1920. The training corpus was carefully filtered to ensure high data quality.
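The total batch size can be related to the GPU count with a little arithmetic. The sketch below assumes the usual data-parallel relationship, total batch size = number of GPUs × per-device batch size × gradient accumulation steps. The text above does not say how the per-GPU share was split between those last two factors, so the code only lists the combinations that would be consistent with the stated numbers; the function names are hypothetical.

```python
NUM_GPUS = 16            # stated above
TOTAL_BATCH_SIZE = 1920  # stated above


def per_gpu_share(total_batch_size: int, num_gpus: int) -> int:
    """Samples each GPU contributes to one optimizer step."""
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size is not divisible by the GPU count")
    return total_batch_size // num_gpus


def candidate_splits(per_gpu: int) -> list[tuple[int, int]]:
    """All (per-device batch size, gradient accumulation steps) pairs
    whose product equals the per-GPU share."""
    return [(micro, per_gpu // micro)
            for micro in range(1, per_gpu + 1)
            if per_gpu % micro == 0]


share = per_gpu_share(TOTAL_BATCH_SIZE, NUM_GPUS)
print(f"Each GPU contributes {share} samples per optimizer step")
for micro_batch, accumulation_steps in candidate_splits(share):
    print(f"  per-device batch {micro_batch} x accumulation {accumulation_steps}")
```

Which of these combinations was actually used is not stated here; larger per-device batches need more GPU memory, while more accumulation steps trade memory for wall-clock time.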
Intended Uses & Limitations
Fietje 2 is designed for Dutch language generation tasks. Users should be aware of general LLM limitations, including the potential for hallucinations and inaccuracies, similar to its base model, phi-2.
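As an illustration of a Dutch generation task, the following is a minimal sketch of prompting the model through the Hugging Face transformers text-generation pipeline. It assumes that transformers and a backend such as PyTorch are installed and that the weights can be downloaded under the repository name BramVanroy/fietje-2 given at the top of this page. The helper names, the prompt, and the generation settings are hypothetical choices for the example, not recommendations from the model author.

```python
MODEL_ID = "BramVanroy/fietje-2"  # repository name from the top of this page


def generation_settings(max_new_tokens: int = 50) -> dict:
    """Illustrative settings: greedy decoding with a cap on new tokens."""
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def generate_dutch(prompt: str, max_new_tokens: int = 50) -> str:
    """Continue a Dutch prompt. Downloads the model on first use."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, **generation_settings(max_new_tokens))
    # The pipeline returns one dict per generated sequence.
    return outputs[0]["generated_text"]


# Example call (not executed here because it downloads several GB of weights):
# print(generate_dutch("De fiets is in Nederland"))
```

Because this is the base variant, it continues the prompt rather than following instructions, and the output should be checked for the hallucinations and inaccuracies noted above.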