etri-xainlp/llama2-13b-lima-sft-dpo
The etri-xainlp/llama2-13b-lima-sft-dpo model is a 13-billion-parameter language model developed by the ETRI xainlp team on top of the Llama-2-13b-hf base model. It was fine-tuned in several stages: supervised fine-tuning (SFT) on 650k instruction-following examples, LIMA SFT on 280k examples, and Direct Preference Optimization (DPO) on 90k user preference sets. It targets instruction-following tasks where responses need to be precise and aligned with user preferences.
Model Overview
etri-xainlp/llama2-13b-lima-sft-dpo is a 13-billion-parameter language model from the ETRI xainlp team. It is based on meta-llama/Llama-2-13b-hf and goes through a multi-stage fine-tuning process intended to improve instruction following and alignment with user preferences.
Key Capabilities
- Instruction Following: Supervised fine-tuning (SFT) on 650,000 instruction-following examples gives the model its base ability to understand and carry out instructions.
- LIMA-style SFT: Further refined on 280,000 LIMA-style instruction-following examples. LIMA-style data typically emphasizes high-quality, diverse instructions to improve generalization.
- Preference Alignment (DPO): Trained with Direct Preference Optimization (DPO) on 90,000 user preference sets, which pushes the model's outputs toward the responses that users preferred.
- Text-in, Text-out: Designed for standard text-based input and output, making it versatile for various natural language processing tasks.
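To make the DPO step listed above more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not the ETRI team's training code. The function names, the scalar log-probability inputs, and the value `beta=0.1` are illustrative assumptions. The loss rewards the policy for raising the log-probability of the preferred response, relative to a frozen reference model, more than it raises that of the rejected response.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value, not taken from the model card
) -> float:
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2. The loss falls below that value once the policy favors the chosen response more strongly than the reference does.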
Good For
- Applications requiring a model with strong instruction-following abilities.
- Tasks where alignment with human preferences is crucial.
- Developing chatbots, virtual assistants, or systems that need to generate precise and contextually relevant responses based on explicit instructions.
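For the use cases above, the following is a minimal sketch of text-in, text-out inference with the Hugging Face `transformers` library. It assumes the weights are available on the Hugging Face Hub under the same identifier, that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available (in float16 the weights alone take roughly 26 GB, that is, 13 billion parameters at 2 bytes each). The sampling defaults and the plain-string prompt are assumptions. The model card does not specify a prompt template or recommended settings.

```python
MODEL_ID = "etri-xainlp/llama2-13b-lima-sft-dpo"


def generation_kwargs(max_new_tokens=256, temperature=0.7, top_p=0.9):
    """Build sampling settings for generate(); the defaults are illustrative."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, **overrides):
    """Load the model and return its completion for a plain-text prompt."""
    # Imported here so the helper above can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **generation_kwargs(**overrides))
    # Drop the prompt tokens so that only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate("List three uses of a paperclip.", max_new_tokens=128)` would download the weights on first use. For repeated calls, loading the tokenizer and model once and reusing them is the better design.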