etri-xainlp/SOLAR-10.7B-sft-dpo-v1

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 10.7B · Quantization: FP8 · Context length: 4k · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights · Warm

etri-xainlp/SOLAR-10.7B-sft-dpo-v1 is a 10.7 billion parameter language model developed by the ETRI xainlp team, built upon the davidkim205/nox-solar-10.7b-v4 base model. It was fine-tuned with supervised fine-tuning (SFT) on a dataset of 1.8 million instruction-following examples and further optimized with Direct Preference Optimization (DPO) on a dataset of 221k user preference pairs. The model generates text outputs from text inputs, leveraging this specialized training for improved instruction adherence and alignment with user preferences.


Model Overview

etri-xainlp/SOLAR-10.7B-sft-dpo-v1 is a 10.7 billion parameter language model developed by the ETRI xainlp team. It is built on the davidkim205/nox-solar-10.7b-v4 base model and processes text inputs to generate text outputs.

Key Training Details

This model underwent a two-stage fine-tuning process:

  • Supervised Fine-Tuning (SFT): The model was initially fine-tuned using a LoRA (Low-Rank Adaptation) approach on a substantial dataset comprising 1,821,734 instruction-following examples. This stage aims to enhance the model's ability to follow instructions and generate coherent responses.
  • Direct Preference Optimization (DPO): Following SFT, the model was further optimized using DPO, also with LoRA, on a dataset of 221,869 user preference examples. This stage refines the model's outputs to better align with human preferences and quality judgments.
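The DPO stage optimizes the policy against a frozen reference model so that chosen responses become more likely than rejected ones. The per-pair objective can be sketched in plain Python; the `beta` value and the log-probabilities below are illustrative, not this model's actual training configuration:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed token log-probabilities of the chosen and
    rejected responses under the policy and the frozen reference
    model; beta controls how far the policy may drift from the
    reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): shrinks as the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A policy that already prefers the chosen response incurs a lower loss
# than one that is indifferent between the two responses.
loss_aligned = dpo_loss(-10.0, -20.0, -15.0, -15.0)
loss_neutral = dpo_loss(-15.0, -15.0, -15.0, -15.0)  # margin 0 -> log(2)
```

In practice this loss is computed batch-wise over token log-probabilities from two forward passes (policy and reference), with LoRA restricting which weights receive gradients.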

Training was conducted on eight A100 80GB GPUs, indicating a significant computational investment. The combination of SFT and DPO aims to produce a model that can both follow complex instructions and generate outputs that users prefer.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
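These parameters correspond to standard decoding-time filters applied to the model's next-token logits. A minimal pure-Python sketch of how temperature, top_k, and top_p interact (the values used below are illustrative, not actual user configurations):

```python
import math

def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature, top-k, and top-p (nucleus) filtering to raw
    logits and return the resulting next-token sampling distribution."""
    # Temperature scaling: <1 sharpens, >1 flattens the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank tokens by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # top-k: only the k most likely tokens are eligible (0 = no limit).
    keep = set(order[:top_k]) if top_k > 0 else set(order)
    # top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    cum, nucleus = 0.0, set()
    for i in order:
        if i in keep:
            nucleus.add(i)
            cum += probs[i]
            if cum >= top_p:
                break
    # Renormalise over the surviving tokens.
    mass = sum(probs[i] for i in nucleus)
    return [probs[i] / mass if i in nucleus else 0.0 for i in range(len(probs))]

# Four-token toy vocabulary: filtering zeroes out the tail tokens.
dist = filter_logits([2.0, 1.0, 0.5, 0.1],
                     temperature=0.7, top_k=2, top_p=0.95)
```

frequency_penalty, presence_penalty, repetition_penalty, and min_p are further logit adjustments applied before this step; the sketch above covers only the three core filters.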