BramVanroy/GEITje-7B-ultra-sft

Text generation · Model size: 7B · Quant: FP8 · Context length: 8k · Published: Jan 22, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

BramVanroy/GEITje-7B-ultra-sft is a 7-billion-parameter instruction-tuned causal language model developed by Bram Vanroy, built on Rijgersberg/GEITje-7B, which is itself based on Mistral 7B. Fine-tuned on roughly 240 million tokens of synthetic Dutch data, including conversations generated with GPT-3.5-turbo and GPT-4-turbo, it handles multi-turn conversations and code-related questions with an 8192-token context length. The model is designed specifically for conversational AI in Dutch, leveraging diverse synthetic data for robust interaction.


GEITje-7B-ultra-sft: A Conversational Dutch LLM

BramVanroy/GEITje-7B-ultra-sft is a 7-billion-parameter instruction-tuned model, building on Rijgersberg/GEITje-7B, a Mistral 7B model further pretrained on Dutch data. This model is fine-tuned specifically for conversational use, leveraging a diverse set of synthetic datasets totaling approximately 240 million tokens, including data generated by GPT-3.5-turbo and GPT-4-turbo.

Key Capabilities & Training Insights

  • Dutch Conversational AI: Optimized for multi-turn conversations in Dutch, incorporating various user personas (e.g., language learners, experts, children) during training to enhance adaptability.
  • Synthetic Data Training: Trained on a unique blend of translated and newly generated Dutch datasets, with 85.42% from BramVanroy/ultrachat_200k_dutch (GPT-4-turbo, multi-turn) and significant contributions from StackOverflow, Alpaca, and Dolly datasets.
  • Context Length: Supports an 8192-token context length, enabling more extensive and coherent conversations.
  • Training Methodology: Trained in full (without LoRA) using bfloat16 and Flash Attention 2, following the Hugging Face alignment handbook.
  • System Message Compatibility: Utilizes the Zephyr chat template, allowing for the inclusion of system messages in conversations.
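Since the card notes Zephyr-template compatibility, a minimal sketch of how that template lays out a conversation, including an optional system message, may help. In practice you would call `tokenizer.apply_chat_template()` from the `transformers` library; this hand-rolled version (with an illustrative Dutch exchange) only shows the prompt structure the template is assumed to produce.

```python
# Minimal sketch of a Zephyr-style chat prompt (assumed structure, for
# illustration only; use tokenizer.apply_chat_template() in real code).

def format_zephyr_prompt(messages):
    """Render a list of {role, content} dicts into a Zephyr-style prompt."""
    parts = []
    for message in messages:
        # Each turn: a role tag, the content, then an end-of-sequence marker.
        parts.append(f"<|{message['role']}|>\n{message['content']}</s>\n")
    # Trailing assistant tag cues the model to generate its reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "Je bent een behulpzame assistent."},
    {"role": "user", "content": "Wat is de hoofdstad van Nederland?"},
]
prompt = format_zephyr_prompt(messages)
print(prompt)
```

The resulting string would be tokenized and passed to the model; the system turn is optional and can simply be omitted from the message list.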

Important Considerations

  • Alignment: This model is an SFT (chat-tuned) version and has not been aligned with DPO or other reinforcement learning techniques. For aligned use, the DPO variant is recommended.
  • Commercial Use: Due to its training on synthetic data derived from OpenAI/Azure services, this model is not suitable for commercial purposes.
  • Limitations: As an unaligned model, it may generate inaccurate, misleading, or potentially offensive content. Users should exercise caution and use it at their own risk.

Popular Sampler Settings

The three sampler configurations most used by Featherless users for this model each set the following parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
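To make these parameters concrete, here is a hedged sketch of a sampler configuration as it might be passed to Hugging Face `model.generate(...)`. The specific values below are illustrative assumptions, not the community presets from this page; note also that `frequency_penalty` and `presence_penalty` are OpenAI-compatible API parameters, while Hugging Face's own API uses `repetition_penalty` for repetition control.

```python
# Illustrative sampler settings (values are assumptions, not the presets
# shown on the model page). These keyword names map onto Hugging Face
# generate()/GenerationConfig fields; min_p needs a recent transformers.
sampling_params = {
    "do_sample": True,           # enable stochastic sampling
    "temperature": 0.7,          # soften the token distribution
    "top_p": 0.9,                # nucleus sampling cutoff
    "top_k": 50,                 # keep only the 50 most likely tokens
    "repetition_penalty": 1.1,   # discourage verbatim loops
    "min_p": 0.05,               # drop tokens below 5% of the top probability
    "max_new_tokens": 512,       # cap the reply length
}

# Usage sketch (model/inputs not constructed here):
#   outputs = model.generate(**inputs, **sampling_params)
print(sampling_params)
```

Lower temperatures and a tighter top_p make Dutch replies more deterministic; raising them increases variety at the cost of coherence.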