etri-xainlp/SOLAR-10.7B-merge-dpo

Text Generation · Concurrency Cost: 1 · Model Size: 10.7B · Quant: FP8 · Ctx Length: 4k · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights · Warm

The etri-xainlp/SOLAR-10.7B-merge-dpo model, developed by the ETRI xainlp team, is a 10.7-billion-parameter language model. It was created by merging heavytail/kullm-solar into upstage/SOLAR-10.7B-Instruct-v1.0 using MergeKit, then fine-tuned on a 90k-entry user preference dataset with DPO and LoRA. The model accepts text-only input and produces text-only output.


Model Overview

etri-xainlp/SOLAR-10.7B-merge-dpo is a 10.7 billion parameter language model developed by the ETRI xainlp team. This model is a result of merging two existing models: heavytail/kullm-solar and upstage/SOLAR-10.7B-Instruct-v1.0, utilizing MergeKit for the integration process.

Key Characteristics

  • Architecture: Built upon the SOLAR-10.7B-Instruct-v1.0 base model, enhanced by merging with kullm-solar.
  • Training: Fine-tuned using a combination of Direct Preference Optimization (DPO) and LoRA (Low-Rank Adaptation) on a 90,000-entry user preference dataset.
  • Input/Output: Designed to process text-only inputs and generate text-only outputs.
  • Hardware: Training was conducted on a single A100 GPU with 80 GB of memory.
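The characteristics above can be exercised with a short loading sketch using Hugging Face `transformers`. The prompt template is an assumption carried over from the base model, upstage/SOLAR-10.7B-Instruct-v1.0 (its documented `### User:` / `### Assistant:` format); the helper name `format_prompt` is ours, not part of the model card.

```python
MODEL_ID = "etri-xainlp/SOLAR-10.7B-merge-dpo"


def format_prompt(user_message: str) -> str:
    """Wrap a user message in the SOLAR-Instruct prompt template.

    Assumption: the merged model keeps the template of its base model,
    upstage/SOLAR-10.7B-Instruct-v1.0.
    """
    return f"### User:\n{user_message}\n\n### Assistant:\n"


if __name__ == "__main__":
    # Imported lazily so the helper above is usable without transformers
    # installed. Loading downloads the full weights and needs a large GPU
    # (the card notes an 80 GB A100 was used for training; inference fits
    # in less, especially quantized).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    prompt = format_prompt("Summarize DPO in one sentence.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```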

Use Cases

This model is suitable for applications requiring text generation and understanding, particularly where DPO fine-tuning on user preferences is valuable. Its merged architecture suggests it may combine strengths from both constituent models, making it a versatile option for a range of NLP tasks.
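For hosted use, a minimal client sketch follows, assuming the model is served behind an OpenAI-compatible completions endpoint (the URL and the `FEATHERLESS_API_KEY` variable are placeholders, not confirmed by this card). Only the Python standard library is used.

```python
import json
import os
import urllib.request

MODEL_ID = "etri-xainlp/SOLAR-10.7B-merge-dpo"
# Placeholder endpoint; substitute your provider's OpenAI-compatible URL.
API_URL = "https://api.featherless.ai/v1/completions"


def build_request(prompt: str, temperature: float = 0.7,
                  top_p: float = 0.9, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style completions payload for this model."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    payload = build_request(
        "### User:\nWrite a haiku about merging models.\n\n### Assistant:\n"
    )
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('FEATHERLESS_API_KEY', '')}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["text"])
```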

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model adjust the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
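To make these knobs concrete, here is a minimal reference sketch of how temperature, top_k, top_p, and min_p typically transform a next-token distribution. This is an illustrative implementation, not the provider's actual sampler; the three penalty settings are omitted because they additionally depend on previously generated tokens.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_temperature(logits, temperature):
    """temperature < 1 sharpens the distribution; > 1 flattens it."""
    return [x / temperature for x in logits]


def apply_top_k(probs, k):
    """Keep only the k most likely tokens, then renormalize."""
    cutoff = sorted(probs, reverse=True)[k - 1]
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def apply_top_p(probs, p):
    """Keep the smallest set of tokens whose cumulative probability >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in order:
        keep.add(i)
        cum += probs[i]
        if cum >= p:
            break
    kept = [probs[i] if i in keep else 0.0 for i in range(len(probs))]
    total = sum(kept)
    return [q / total for q in kept]


def apply_min_p(probs, min_p):
    """Drop tokens whose probability is below min_p * max(probs)."""
    threshold = min_p * max(probs)
    kept = [q if q >= threshold else 0.0 for q in probs]
    total = sum(kept)
    return [q / total for q in kept]
```

In practice these filters are chained (temperature on logits first, then top_k/top_p/min_p on the resulting probabilities) before a token is drawn from whatever mass remains.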