miulab/llama2-7b-alpaca-sft-10k

Task: Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4k · License: apache-2.0 · Architecture: Transformer (open weights)

miulab/llama2-7b-alpaca-sft-10k is a 7-billion-parameter language model based on Llama 2, developed by miulab and fine-tuned with supervised fine-tuning (SFT) on 10,000 Alpaca-style instructions. It serves as the backbone SFT model for the DogeRM research, which equips reward models with domain knowledge through model merging, and is intended for research into reward-model development and domain-specific knowledge integration.


Model Overview

The model builds on the Llama 2 architecture with 7 billion parameters. Supervised fine-tuning on a dataset of 10,000 Alpaca-style instructions gives it general instruction-following capabilities.
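Because the model follows the standard Llama 2 causal-LM architecture, it can be loaded with the Hugging Face transformers library. The sketch below is illustrative only; the Alpaca-style prompt template is an assumption based on the fine-tuning data, so check the repository for the exact template used during training.

```python
# Minimal inference sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "miulab/llama2-7b-alpaca-sft-10k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on a single GPU
    device_map="auto",
)

# Alpaca-style prompt template (an assumption based on the SFT data;
# verify against the repository before relying on it).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what supervised fine-tuning is in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```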

Key Characteristics

  • Base model: Llama 2 (7 billion parameters)
  • Fine-tuning: supervised fine-tuning (SFT) on 10,000 Alpaca-style instructions
  • Role: backbone SFT model for the DogeRM reward-modeling research
  • Context length: 4k tokens
  • License: apache-2.0

Intended Use Cases

This model is primarily intended for:

  • Research in Reward Modeling: Serving as a base for developing and experimenting with reward models, particularly for integrating domain-specific knowledge via model merging (a minimal merging sketch follows this list).
  • Understanding SFT Impact: Investigating the effects of SFT on Llama 2 models for instruction following.
  • Academic Exploration: Supporting studies related to model merging techniques and their application in enhancing LLM capabilities.
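DogeRM transfers domain knowledge into a reward model by merging its weights with those of a domain-specific SFT model. The paper describes the exact procedure; the snippet below is only a generic linear weight-interpolation sketch, assuming both checkpoints share the Llama 2 backbone and parameter names. `path/to/your-reward-model` and `alpha = 0.5` are hypothetical placeholders, not values from the paper.

```python
# Generic linear weight merge (NOT necessarily DogeRM's exact recipe):
# merged = (1 - alpha) * reward_model + alpha * domain_sft_model, parameter-wise.
import torch
from transformers import AutoModelForCausalLM

sft_id = "miulab/llama2-7b-alpaca-sft-10k"
reward_id = "path/to/your-reward-model"  # hypothetical placeholder

sft_model = AutoModelForCausalLM.from_pretrained(sft_id, torch_dtype=torch.float32)
reward_model = AutoModelForCausalLM.from_pretrained(reward_id, torch_dtype=torch.float32)

alpha = 0.5  # merge weight; a tunable hyperparameter, not a value from the paper
sft_state = sft_model.state_dict()
merged_state = reward_model.state_dict()

for name, param in merged_state.items():
    # Interpolate only parameters present in both checkpoints with matching
    # shapes; any reward-specific head is kept from the reward checkpoint.
    if name in sft_state and param.shape == sft_state[name].shape:
        merged_state[name] = (1 - alpha) * param + alpha * sft_state[name]

reward_model.load_state_dict(merged_state)
reward_model.save_pretrained("merged-reward-model")
```

In practice a reward model usually carries an extra scalar head on top of the transformer body; the name/shape check above leaves such parameters untouched and merges only the shared backbone.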

Detailed training and evaluation information is available via the Weights & Biases link provided in the original paper's resources.