allenai/tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm
Text generation
Concurrency cost: 1
Model size: 13B
Quantization: FP8
Context length: 4k
Published: Jun 11, 2024
License: apache-2.0
Architecture: Transformer (open weights)

allenai/tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm is a 13-billion-parameter language model from the Allen Institute for AI (AI2), fine-tuned from Llama-2-13b-hf. It is part of the Tulu V2.5 series and was trained with Proximal Policy Optimization (PPO), using a 13B reward model trained on a preference-data mixture and prompts drawn from UltraFeedback. The model is intended to serve as a helpful assistant, performing well in chat-based interactions and general instruction following.
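Because the model is chat-tuned, prompts should follow the Tulu chat convention of `<|user|>` / `<|assistant|>` turn markers rather than raw text. A minimal sketch of that formatting is below; the helper name is hypothetical, and the commented-out generation snippet assumes the `transformers` library plus enough GPU memory for a 13B model.

```python
def format_tulu_prompt(messages):
    """Render a list of {role, content} dicts in the Tulu chat format.

    Each turn becomes '<|role|>\n<content>\n'; the prompt ends with an
    open '<|assistant|>' marker so the model continues as the assistant.
    """
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    return prompt + "<|assistant|>\n"


# Example usage with transformers (commented out: downloads ~13B weights):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# name = "allenai/tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm"
# tok = AutoTokenizer.from_pretrained(name)
# model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")
# text = format_tulu_prompt([{"role": "user", "content": "Hello!"}])
# out = model.generate(**tok(text, return_tensors="pt").to(model.device),
#                      max_new_tokens=128)
# print(tok.decode(out[0], skip_special_tokens=True))

print(format_tulu_prompt([{"role": "user", "content": "Hello!"}]))
```

Note that tokenizers shipped with recent Tulu checkpoints also expose this template via `tokenizer.apply_chat_template`, which is the safer choice when available since it stays in sync with the model's training format.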
