Name: thu-ml/STAIR-Llama-3.1-8B-SFT API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: thu-ml

Model Overview

thu-ml/STAIR-Llama-3.1-8B-SFT is an 8 billion parameter instruction-tuned model, building upon the meta-llama/Llama-3.1-8B-Instruct architecture. Developed by thu-ml, this model is a core component of the STAIR framework, designed for enhanced reasoning and self-improvement capabilities.

Key Capabilities

Step-level Chain-of-Thought (CoT) Reasoning: The model is fine-tuned on the STAIR-SFT dataset, which comprises 20,000 prompts from UltraFeedback and PKU-SafeRLHF, all formatted with step-level CoT answers. This training enables the model to produce detailed, step-by-step reasoning processes.
Ethical and Safety Alignment: As demonstrated by its handling of sensitive queries, the model is designed to provide safe and ethical responses, refusing to engage in harmful or illegal requests while offering appropriate guidance.
Structured Output: Responses are structured with <|Reasoning_step|> and <|Output|> tags, allowing for easy extraction of final answers and analysis of the reasoning process.

Use Cases

Complex Problem Solving: Ideal for applications requiring transparent, step-by-step reasoning to arrive at a solution.
Content Moderation and Safety: Can be employed in scenarios where ethical considerations and refusal to generate harmful content are paramount.
Educational Tools: Useful for generating explanations and thought processes behind answers, aiding in learning and understanding.

More details on the framework and usage can be found on the STAIR GitHub Repository.

Overview

Model Overview

Key Capabilities

Use Cases

Full Model Card (README)