allenai/tulu-v2.5-dpo-13b-hh-rlhf
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quantization: FP8 · Context Length: 4K · Published: Jun 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

The allenai/tulu-v2.5-dpo-13b-hh-rlhf model is a 13-billion-parameter language model from the Allen Institute for AI (AI2), fine-tuned from Llama-2-13b-hf with DPO (Direct Preference Optimization) on the HH-RLHF preference dataset. It is part of the Tulu V2.5 series and is designed to function as a helpful assistant: preference tuning on human feedback data aligns its responses with human preferences and improves conversational quality.
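A minimal sketch of querying the model locally with Hugging Face `transformers`, assuming the weights are available on the Hub under this model ID (this is not an official snippet; the prompt format below follows the `<|user|>`/`<|assistant|>` convention used by Tulu-family models):

```python
def build_tulu_prompt(user_message: str) -> str:
    """Wrap a user message in the Tulu chat format.

    Tulu models are trained to respond after an ``<|assistant|>`` tag,
    with the user turn preceded by ``<|user|>``.
    """
    return f"<|user|>\n{user_message}\n<|assistant|>\n"


def generate(user_message: str,
             model_id: str = "allenai/tulu-v2.5-dpo-13b-hh-rlhf",
             max_new_tokens: int = 256) -> str:
    """Generate a response from the model (requires GPU memory for 13B weights)."""
    # Lazy import so the prompt helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_tulu_prompt(user_message), return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("What is Direct Preference Optimization?"))
```

The `device_map="auto"` argument lets `transformers` shard the 13B weights across available GPUs; a hosted inference endpoint would replace the `generate` call with an API request.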