RLHFlow/LLaMA3-SFT-v2
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Sep 28, 2024Architecture:Transformer0.0K Warm

RLHFlow/LLaMA3-SFT-v2 is an 8 billion parameter instruction-tuned causal language model developed by RLHFlow, based on Meta-Llama-3-8B. This model is a supervised fine-tuning (SFT) checkpoint, specifically designed for use within the RLHFlow/Online-RLHF project. It demonstrates strong performance across academic benchmarks, particularly excelling in mathematical reasoning (GSM8K, MATH) and code generation (HumanEval) compared to its base model.

Loading preview...

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p