allenai/llama-3.1-tulu-2-dpo-8b
Text Generation
- Concurrency Cost: 1
- Model Size: 8B
- Quantization: FP8
- Context Length: 32k
- Published: Aug 9, 2024
- License: apache-2.0
- Architecture: Transformer
- Open Weights

allenai/llama-3.1-tulu-2-dpo-8b is an 8-billion-parameter instruction-tuned language model from AllenAI, fine-tuned from Meta-Llama-3.1-8B. It was first trained on a mix of publicly available, synthetic, and human-created datasets, then further aligned with Direct Preference Optimization (DPO) on the UltraFeedback dataset. The model is designed to act as a helpful assistant and shows improved truthfulness and instruction following over its base model, with a 32,768-token context length.
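As a minimal sketch of how one might prompt the model: Tulu-family chat models are typically served with a simple role-marker template (turns wrapped in <|user|> and <|assistant|> tags). The helper below is illustrative only; the exact template should be verified against the model's tokenizer chat template before use.

```python
def format_tulu_prompt(messages):
    """Build a Tulu-style chat prompt from a list of {role, content} dicts.

    Assumes the role-marker template commonly used by Tulu models;
    check the model's own chat template to confirm.
    """
    parts = []
    for msg in messages:
        # Each turn is wrapped in a <|role|> marker followed by its text.
        parts.append(f"<|{msg['role']}|>\n{msg['content']}\n")
    # End with an open assistant tag so the model generates the reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = format_tulu_prompt([
    {"role": "user", "content": "Explain Direct Preference Optimization in one sentence."},
])
print(prompt)
```

If the model is served behind an OpenAI-compatible chat endpoint, this formatting is usually applied server-side and plain chat messages can be sent instead.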
