OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | License: llama3 | Architecture: Transformer | Status: Warm

OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2 is an 8 billion parameter instruction-tuned language model based on Meta-Llama-3-8B-Instruct, with an 8192 token context length. It was trained with Direct Preference Optimization (DPO) on the mlabonne/orpo-dpo-mix-40k dataset to refine its instruction-following behavior. This release supersedes an earlier version that had tokenization issues, though its current open LLM benchmark results are noted as unexpectedly low.
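
For local experimentation, the checkpoint can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch, assuming a CUDA-capable GPU and that the repository ships the standard Llama 3 chat template; it loads the weights in bf16 locally, whereas the FP8 quantization listed above refers to the hosted deployment.

```python
# Minimal sketch: load the model locally with Hugging Face transformers.
# Assumes a CUDA-capable GPU and the standard Llama 3 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # local load in bf16; the hosted version runs FP8
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize Direct Preference Optimization in two sentences."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```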


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration sets the sampler parameters listed below; a request sketch using these parameters follows the list.

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
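
The sketch below shows how these sampler parameters might be passed when calling the hosted model through an OpenAI-compatible endpoint. The base URL and all numeric values are placeholders and assumptions, not the actual user configurations (those values are not shown on this page); parameters outside the OpenAI schema, such as top_k, repetition_penalty, and min_p, are passed via extra_body.

```python
# Hedged sketch: query the hosted model via an OpenAI-compatible API.
# Base URL and all sampler values are placeholders, not the real popular configs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2",
    messages=[{"role": "user", "content": "Write a haiku about preference optimization."}],
    # Standard OpenAI sampler parameters (placeholder values):
    temperature=0.8,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard parameters are commonly forwarded via extra_body:
    extra_body={
        "top_k": 40,
        "repetition_penalty": 1.1,
        "min_p": 0.05,
    },
)
print(response.choices[0].message.content)
```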