jae24/openhermes_dpo_norobot_0201
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 2, 2024 · License: MIT · Architecture: Transformer · Open Weights
jae24/openhermes_dpo_norobot_0201 is a 7-billion-parameter language model based on teknium/OpenHermes-2.5-Mistral-7B, with a 4096-token context length. This variant has undergone preference fine-tuning using Direct Preference Optimization (DPO), an alternative to RLHF-style reinforcement learning, on a preference dataset derived from Hugging Face's no_robots dataset. It is intended for instruction-following and chat tasks where alignment with human preferences matters.
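As a minimal usage sketch, the model can presumably be loaded through the standard transformers API like any Mistral-7B-derived checkpoint. The repo id, chat template availability, and generation settings below are assumptions rather than published usage instructions:

```python
# Minimal inference sketch, assuming the checkpoint is hosted on the
# Hugging Face Hub under this repo id and ships a chat template like
# its OpenHermes-2.5-Mistral-7B base model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jae24/openhermes_dpo_norobot_0201"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain DPO fine-tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Prompt plus completion must fit within the 4096-token context window.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the FP8 quantization listed in the header likely describes the hosted serving configuration; loading locally as above uses the dtype the checkpoint was saved in unless it is quantized explicitly.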