RatanRohith/NeuralPizza-7B-V0.1
Text generation · Model size: 7B · Quant: FP8 · Ctx length: 4k · Published: Jan 12, 2024 · License: apache-2.0 · Architecture: Transformer

RatanRohith/NeuralPizza-7B-V0.1 is a 7-billion-parameter language model fine-tuned from SanjiWatsuki/Kunoichi-7B using Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset. It is intended primarily for research and experimentation with DPO-based fine-tuning of language models.
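A minimal sketch of loading the model for text generation with the Hugging Face `transformers` library (the generation settings below, such as FP16 weights and `max_new_tokens`, are illustrative assumptions, not settings documented by the model author):

```python
# Sketch: text generation with RatanRohith/NeuralPizza-7B-V0.1 via transformers.
# The heavy model download happens only when generate() is actually called.

MODEL_ID = "RatanRohith/NeuralPizza-7B-V0.1"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so the module can be read/tested without pulling in
    # transformers or downloading the ~7B-parameter weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: FP16 to fit a single ~16 GB GPU
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain Direct Preference Optimization in one sentence."))
```

Note that although the card lists an FP8 quantization for serving, the snippet above loads the open weights in FP16, which is the more common path with stock `transformers`.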
