wenbopan/Faro-Yi-34B-DPO
Faro-Yi-34B-DPO is a 34 billion parameter DPO-tuned causal language model developed by wenbopan. This model is an instruction-tuned version of Faro-Yi-34B, which itself is based on the Yi-34B-200K architecture, and excels at various tasks. It demonstrates improved performance over the original Yi-34B-200K, particularly in both short and long context scenarios up to 32768 tokens.
Loading preview...
Overview
wenbopan/Faro-Yi-34B-DPO is a 34 billion parameter language model that has undergone Direct Preference Optimization (DPO). It is derived from the wenbopan/Faro-Yi-34B base model, which in turn builds upon the Yi-34B-200K architecture. This DPO-tuned version significantly enhances performance across a range of tasks compared to its predecessors.
Key Capabilities
- Improved Performance: The DPO tuning has led to substantial improvements over the original Yi-34B-200K model.
- Context Length: The model is designed to perform well in both short and long contexts, supporting a context length of up to 32768 tokens.
- ChatML Template: It utilizes the ChatML template for conversational interactions, making it suitable for assistant-like applications.
Usage
The model can be efficiently used with vllm for high-throughput inference, especially for processing long documents like PDFs, as demonstrated in the provided example. It also supports integration with the transformers library for standard inference workflows.