vilm/Quyen-Pro-Max-v0.1

Visibility: Public
Parameters: 72.3B
Precision: FP8
Context length: 32,768 tokens
Released: Feb 4, 2024
License: other
Hosted on: Hugging Face

Quyen-Pro-Max-v0.1 Overview

Quyen-Pro-Max-v0.1 is the largest model in the Quyen series developed by vilm, built upon the Qwen1.5 architecture. This 72.3 billion parameter model is designed for advanced language tasks, offering a substantial 32,768-token context window for processing longer inputs and generating comprehensive responses.

Key Capabilities & Training

The model was developed using a combination of Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Its training drew on a varied dataset, including publicly available resources such as OpenHermes-2.5, Capybara, and argilla/distilabel-capybara-dpo-7k-binarized, alongside private data from Ontocord and BEE-spoke-data. This diverse training regimen aims to enhance its conversational abilities and general intelligence.

Prompting and Usage

All Quyen models, including Quyen-Pro-Max-v0.1, use the ChatML format as their default prompt template, so interaction is consistent across the series. Users can generate correctly structured input by calling the tokenizer's apply_chat_template function rather than assembling prompts by hand.
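To make the template concrete, here is a minimal sketch of what a ChatML prompt for this model looks like, assuming the standard ChatML markers (`<|im_start|>` / `<|im_end|>`) used by the Qwen1.5 family; the helper function and message contents below are illustrative, not part of the model card.

```python
# Sketch of the ChatML prompt format (assumption: standard <|im_start|>/
# <|im_end|> markers, as in Qwen1.5-family tokenizers).
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML string,
    ending with the assistant header so the model continues from there."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation prompt for the reply
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Quyen series in one sentence."},
]
prompt = build_chatml_prompt(messages)
```

In practice, loading the model's tokenizer with the transformers library and calling `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should produce an equivalent string directly from the template bundled with the model.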

Good For

  • General-purpose conversational AI: Its large parameter count and extensive training make it suitable for complex dialogue and understanding.
  • Applications requiring long context: The 32768-token context length is beneficial for tasks involving detailed documents or extended conversations.
  • Developers familiar with Qwen1.5: Built on the established Qwen1.5 architecture, it offers a familiar yet powerful option for various NLP tasks.