logicker/SkkuDS-DPO-72B-v1: DPO-Tuned Qwen1.5-72B
Overview
This model, logicker/SkkuDS-DPO-72B-v1, is a 72.3-billion-parameter language model built on the Qwen1.5 architecture. It has been fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset, improving its alignment with human preferences and the quality of its instruction-following responses.
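For intuition, the core of DPO is a simple loss over preference pairs: it rewards the policy for assigning relatively more probability to the chosen response than the reference model does, compared with the rejected response. Below is a minimal numeric sketch of that loss; the log-probability values are made up for illustration, and real training (e.g. with a library such as TRL) operates on per-token log-probabilities from the actual models.

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """Pairwise DPO loss: -log sigmoid(beta * margin).

    Each argument is the total log-probability a model assigns to a
    response; beta scales how strongly deviations from the reference
    model are penalized. Values here are illustrative, not from the
    actual SkkuDS-DPO-72B-v1 training run.
    """
    # Implicit rewards: how much more the policy likes each response
    # than the reference model does.
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)); small when the policy clearly prefers
    # the chosen response relative to the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical log-probs: the policy has shifted toward the chosen
# response (-10 vs. reference -12) and away from the rejected one.
loss = dpo_loss(policy_chosen=-10.0, policy_rejected=-15.0,
                ref_chosen=-12.0, ref_rejected=-13.0)
```

A positive margin drives the loss below `log(2)` (the value at zero margin), which is the direction training pushes the policy.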
Key Capabilities
- DPO Fine-tuning: Optimized for better alignment with human preferences through DPO on a high-quality dataset.
- Multilingual Support: Improved multilingual performance in both the base and chat variants.
- Extended Context Length: Stable support for a 32,768-token context window, enabling longer inputs and more coherent extended outputs.
- Robust Architecture: Based on the Transformer architecture with SwiGLU activation, attention QKV bias, and an improved tokenizer for multiple natural languages and code.
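Qwen1.5 chat models are typically prompted with the ChatML conversation format. The sketch below builds such a prompt by hand to show the structure; in practice you would load this model's tokenizer with `transformers` and call `apply_chat_template`, which applies the exact template shipped with the model. The system message and role names here are illustrative assumptions.

```python
def build_chatml_prompt(messages):
    """Assemble a ChatML-style prompt from role/content messages.

    Illustrative sketch of the ChatML layout used by Qwen-family chat
    models: each turn is wrapped in <|im_start|>role ... <|im_end|>,
    and the prompt ends with an open assistant turn for generation.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave the assistant turn open so the model completes it.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
```

With the real tokenizer, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces the equivalent string without hand-rolling the template.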
Good for
- Applications requiring highly aligned and preference-tuned language generation.
- Complex tasks benefiting from a large context window, such as summarization of long documents or multi-turn conversations.
- Multilingual natural language processing tasks.
- Research and development in advanced large language models and DPO techniques.