ewqr2130/alignment-handbook-zephyr-7b_ppostep_100
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 18, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

ewqr2130/alignment-handbook-zephyr-7b_ppostep_100 is a 7-billion-parameter language model published by ewqr2130. It is a PPO-tuned variant of the alignment-handbook-zephyr-7b-sft model, produced by running 100 steps of Proximal Policy Optimization on top of that supervised fine-tuned base. The model is intended for tasks that benefit from improved alignment and instruction following.
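The card does not include the training code, but the PPO step it describes centers on the clipped surrogate objective. The sketch below is a minimal, self-contained illustration of that loss; the function name and the eps=0.2 clip range are illustrative assumptions, not the author's actual settings.

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """Mean clipped PPO surrogate loss over a batch of (log-prob, advantage) pairs.

    eps=0.2 is the common default clip range, assumed here for illustration.
    """
    losses = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)          # pi_new(a|s) / pi_old(a|s)
        unclipped = ratio * adv
        clipped = max(min(ratio, 1 + eps), 1 - eps) * adv
        losses.append(-min(unclipped, clipped))  # negate: we minimize the loss
    return sum(losses) / len(losses)

# Example: the new policy raised the log-prob of an action with positive
# advantage, so the ratio exceeds 1 + eps and the gain is clipped at 1.2.
loss = ppo_clip_loss([-1.0], [-1.2], [1.0])
```

Clipping caps how far a single update can move the policy away from the SFT-initialized reference, which is why PPO fine-tuning (here, only 100 steps) tends to preserve the base model's behavior while nudging it toward the reward signal.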
