logicker/SkkuDS-DPO-72B-v1

72.3B parameters · FP8 · 32,768-token context · Feb 15, 2024 · License: tongyi-qianwen · Hugging Face
Overview

logicker/SkkuDS-DPO-72B-v1: DPO-Tuned Qwen1.5-72B

This model, logicker/SkkuDS-DPO-72B-v1, is a 72.3-billion-parameter language model built on the Qwen1.5 architecture. It was fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset, improving its alignment with human preferences and the quality of its instruction-following responses.

Key Capabilities

  • DPO Fine-tuning: Preference-tuned with DPO on the high-quality Intel/orca_dpo_pairs dataset for closer alignment with human preferences.
  • Multilingual Support: Features improved multilingual capabilities in both its base and chat model forms.
  • Extended Context Length: Provides stable support for a substantial 32,768 token context length, enabling processing of longer inputs and generating more coherent, extended outputs.
  • Robust Architecture: Based on the Transformer architecture with SwiGLU activation, attention QKV bias, and an improved tokenizer for multiple natural languages and code.
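To make the DPO fine-tuning above concrete, the function below is an illustrative plain-Python sketch of the standard DPO loss for a single preference pair. It assumes the summed response log-probabilities under the policy and a frozen reference model are already computed; the function name and the beta value are illustrative, not details of this model's actual training run.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response
    under either the policy being trained or the frozen reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the reference, minus the same quantity
    # for the rejected response, scaled by beta.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Negative log-sigmoid of the margin: the loss shrinks as the
    # policy more clearly prefers the chosen response.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Training minimizes this loss over the preference dataset, pushing the policy to assign relatively higher probability to the chosen responses while the reference model anchors it near its starting distribution.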

Good for

  • Applications requiring highly aligned and preference-tuned language generation.
  • Complex tasks benefiting from a large context window, such as summarization of long documents or multi-turn conversations.
  • Multilingual natural language processing tasks.
  • Research and development in advanced large language models and DPO techniques.
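For the long-document use case above, inputs that exceed the 32,768-token window must be split before summarization. A minimal sketch, assuming `tokens` is a list of token ids already produced by the model's tokenizer; the function name and the `reserve` budget for the prompt template and generated output are hypothetical choices, not part of this model's API:

```python
def chunk_for_context(tokens, max_len=32768, reserve=1024):
    """Split a token sequence into consecutive chunks that each fit
    the model's context window, leaving `reserve` tokens of headroom
    for the prompt template and the generated summary."""
    chunk_size = max_len - reserve
    return [tokens[i:i + chunk_size]
            for i in range(0, len(tokens), chunk_size)]
```

Each chunk can then be summarized independently and the partial summaries combined in a final pass, a common map-reduce pattern for long-document summarization.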