OpenRLHF/Llama-3-8b-sft-mixture is an 8-billion-parameter Llama 3-based language model, fine-tuned by OpenRLHF on a diverse mixture of high-quality open-source datasets. It is a supervised fine-tuning (SFT) checkpoint intended as a strong starting point for RLHF research and development, and its varied instructional and conversational training data make it a solid foundation for general language understanding and generation tasks.
## Overview
OpenRLHF/Llama-3-8b-sft-mixture is a supervised fine-tuning (SFT) checkpoint built on Meta's Meta-Llama-3-8B. Developed by OpenRLHF as a foundation for subsequent Reinforcement Learning from Human Feedback (RLHF) research, it was trained for one epoch on a comprehensive mixture of high-quality, open-source datasets.
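Below is a minimal inference sketch using Hugging Face `transformers`. The model ID matches this card, but the dtype, device placement, and sampling settings are assumptions to adjust for your hardware; the sketch also assumes the tokenizer ships a Llama 3 chat template (check `tokenizer.chat_template`, and format prompts manually if it is absent).

```python
# pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenRLHF/Llama-3-8b-sft-mixture"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights (~16 GB) fit your GPU
    device_map="auto",
)

# Illustrative prompt; assumes the tokenizer provides a chat template.
messages = [{"role": "user", "content": "Explain RLHF in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```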
## Key Capabilities
- Strong SFT Baseline: Provides a robust starting point for RLHF experiments, with the supervised fine-tuning stage already completed.
- Diverse Data Training: Trained on a wide array of datasets, including ShareGPT, Evol-Instruct, SlimOrca, MathInstruct, Magicoder-Evol-Instruct, GPT4-LLM, OrcaMath, GPTeacher, and UltraInteract, which strengthens its general conversational and instruction-following abilities.
- Llama 3 Foundation: Benefits from the advanced architecture and pre-training of the Meta-Llama-3-8B model.
## Good For
- RLHF Research: Ideal for researchers and developers who want a solid SFT model to seed their RLHF training pipelines; see the sketch after this list.
- General Purpose Applications: Suitable for various language generation and understanding tasks due to its diverse training data.
- Instruction Following: Exhibits strong instruction-following capabilities thanks to its fine-tuning on instructional datasets.
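Because this checkpoint is meant to seed RLHF training, a common pattern is to load it twice: once as the trainable policy and once as a frozen reference model for the KL penalty. The sketch below illustrates that pattern with plain `transformers` and PyTorch; it is not OpenRLHF's own training code, and `token_logprobs` is a hypothetical helper written for this example.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenRLHF/Llama-3-8b-sft-mixture"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Two copies of the SFT checkpoint (~16 GB each in bf16): one trainable
# policy and one frozen reference for the KL term in PPO-style RLHF.
policy = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
reference = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
reference.eval()
for p in reference.parameters():  # the reference never receives gradients
    p.requires_grad_(False)

def token_logprobs(model, input_ids):
    """Per-token log-probs of input_ids under the model (shifted by one)."""
    logits = model(input_ids).logits[:, :-1, :]
    logp = F.log_softmax(logits, dim=-1)
    return logp.gather(-1, input_ids[:, 1:].unsqueeze(-1)).squeeze(-1)

# Illustrative response to score; in training this would be a sampled rollout.
ids = tokenizer("Example response to score", return_tensors="pt").input_ids
with torch.no_grad():
    kl_per_token = token_logprobs(policy, ids) - token_logprobs(reference, ids)
# In PPO-style RLHF, this per-token KL estimate feeds the penalty that keeps
# the policy close to the SFT checkpoint while it optimizes the reward.
```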