Quyen-Pro-Max-v0.1 Overview
Quyen-Pro-Max-v0.1 is the largest model in the Quyen series developed by vilm, built on the Qwen1.5 architecture. With 72.3 billion parameters and a 32,768-token context window, it is designed for advanced language tasks that involve long inputs and comprehensive responses.
Key Capabilities & Training
The model was developed using a combination of Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Its training drew on a varied dataset, including publicly available resources such as OpenHermes-2.5, Capybara, and argilla/distilabel-capybara-dpo-7k-binarized, alongside private data from Ontocord and BEE-spoke-data. This diverse training regimen aims to enhance its conversational abilities and general intelligence.
Prompting and Usage
All Quyen models, including Quyen-Pro-Max-v0.1, use the ChatML format as their default prompt template, ensuring consistent interaction across the series. Users can produce correctly structured input by calling the tokenizer's apply_chat_template method.
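To illustrate what ChatML-formatted input looks like, here is a minimal sketch in plain Python. The `build_chatml` helper is hypothetical, written only to show the layout; in practice you would let `tokenizer.apply_chat_template` render the prompt for you.

```python
def build_chatml(messages):
    """Render a list of {"role", "content"} dicts in ChatML format (illustrative only)."""
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

When using the Hugging Face tokenizer, passing the same list of message dicts to `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` yields an equivalent prompt without hand-rolling the markers.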
Good For
- General-purpose conversational AI: Its large parameter count and extensive training make it suitable for complex dialogue and understanding.
- Applications requiring long context: The 32768-token context length is beneficial for tasks involving detailed documents or extended conversations.
- Developers familiar with Qwen1.5: Because it builds on the established Qwen1.5 architecture, existing tooling and workflows carry over, making it a powerful option for a range of NLP tasks.