logicker/SkkuDS-DPO-72B-v1 is a 72.3 billion parameter language model based on the Qwen1.5 architecture and fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset. It offers stable support for a 32K-token context length and is multilingual. The model targets natural language understanding and generation tasks, with the DPO fine-tuning intended to improve alignment with human preferences.
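For readers unfamiliar with DPO, the objective it optimizes can be sketched for a single preference pair as below. This is a minimal illustration of the standard DPO loss, not the training code used for this model: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example, and the inputs are assumed to be summed sequence log-probabilities from the policy being trained and from a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss (hypothetical helper; beta=0.1 is an assumed value)."""
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between the chosen and rejected responses.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written as softplus(-margin) for numerical stability.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the chosen response than to the rejected one.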