CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_1000_full
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Jan 13, 2025License:llama2Architecture:Transformer Open Weights Cold

This is a 7 billion parameter Llama-2-chat-hf model, fine-tuned by CharlesLi using Reinforcement Learning from Human Feedback (RLHF) for safety and reflection. It is based on the Llama 3 architecture and was trained for 1000 full reflection steps. The model is optimized for generating safer and more reflective responses, making it suitable for applications requiring cautious and thoughtful AI interactions.

Loading preview...