allenai/tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Context Length: 4k · Published: Jun 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

allenai/tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm is a 13-billion-parameter language model from AllenAI, part of the Tulu V2.5 suite and fine-tuned from Llama-2-13b-hf. It was trained with PPO using UltraFeedback prompts and a 70B reward model, yielding a conversational assistant whose responses are aligned with user preferences.
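Tulu-family models expect prompts in a simple chat markup rather than raw text. The sketch below shows one way to render a message list into that format; the `<|user|>`/`<|assistant|>` tags follow the convention documented for Tulu V2, and the helper name is our own:

```python
# Minimal sketch of the Tulu-style chat prompt format (an assumption based on
# the Tulu V2 documentation); format_tulu_prompt is a hypothetical helper.
def format_tulu_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts into a prompt string."""
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    # A trailing assistant tag cues the model to generate its reply.
    prompt += "<|assistant|>\n"
    return prompt

messages = [{"role": "user", "content": "Explain PPO in one sentence."}]
print(format_tulu_prompt(messages))
# <|user|>
# Explain PPO in one sentence.
# <|assistant|>
```

The resulting string would then be passed to the model (e.g. via Hugging Face `transformers`) for generation.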
