allenai/tulu-2-dpo-7b is a 7-billion-parameter language model from the Allen Institute for AI (AI2), fine-tuned from Llama 2 using Direct Preference Optimization (DPO). It is designed as a helpful conversational assistant and is positioned as a strong alternative to Llama 2 7B Chat. The model was trained on a mix of publicly available, synthetic, and human-generated datasets, primarily in English.
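For illustration, here is a minimal usage sketch with the Hugging Face transformers library. It assumes the model is available on the Hub under `allenai/tulu-2-dpo-7b`, that `transformers` and `torch` are installed, and that a GPU is available; the chat prompt format and generation settings shown are reasonable assumptions rather than official guidance.

```python
# Minimal sketch: load tulu-2-dpo-7b and generate a reply.
# Assumptions: model hosted on the Hugging Face Hub, GPU with enough memory,
# and the "<|user|> ... <|assistant|>" prompt style used by Tulu 2 models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/tulu-2-dpo-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",
)

prompt = "<|user|>\nExplain Direct Preference Optimization in one sentence.\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```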