UCLA-AGI/Mistral7B-PairRM-SPPO-Iter2
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: May 4, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Status: Warm

UCLA-AGI/Mistral7B-PairRM-SPPO-Iter2 is a 7-billion-parameter GPT-like model developed by UCLA-AGI, fine-tuned from Mistral-7B-Instruct-v0.2. It is the second iteration of Self-Play Preference Optimization (SPPO), aligned using synthetic preference data derived from the UltraFeedback dataset. The model targets improved alignment and response quality, as reflected in its results on benchmarks such as AlpacaEval and MT-Bench.
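Because this model is fine-tuned from Mistral-7B-Instruct-v0.2, prompts are typically wrapped in Mistral's `[INST] ... [/INST]` instruct template. A minimal sketch of that formatting is below; in practice, prefer `tokenizer.apply_chat_template` from the model's own tokenizer, which applies the canonical template.

```python
def format_mistral_prompt(turns):
    """Build a Mistral-instruct style prompt string.

    `turns` is a list of (user, assistant) pairs; the final pair may use
    assistant=None to request a fresh completion. This is an illustrative
    sketch of the [INST] template, not the tokenizer's exact output.
    """
    out = "<s>"
    for user, assistant in turns:
        out += f"[INST] {user} [/INST]"
        if assistant is not None:
            out += f" {assistant}</s>"
    return out

prompt = format_mistral_prompt([("What is SPPO?", None)])
# prompt == "<s>[INST] What is SPPO? [/INST]"
```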


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

Tunable sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.
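These sampler parameters are typically passed alongside the model name in an OpenAI-compatible chat-completions request. The sketch below assembles such a payload; the numeric values are illustrative placeholders (the actual user configurations are not shown on this page), not recommended settings.

```python
# Hypothetical request payload for an OpenAI-compatible completions API.
# All sampler values here are assumed placeholders, not the "top 3" configs
# referenced above.
payload = {
    "model": "UCLA-AGI/Mistral7B-PairRM-SPPO-Iter2",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
```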