Name: viethq188/Rabbit-7B-v2-DPO-Chat API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: viethq188

Model Overview

viethq188/Rabbit-7B-v2-DPO-Chat is a 7-billion-parameter language model developed by viethq188. It was built by merging two models, AIDC-ai-business/Marcoroni-7B-v3 and Q-bert/MetaMath-Cybertron-Starling, using slerp (spherical linear interpolation). The merge was configured to blend the self-attention and MLP layers of the two source models with different interpolation values.
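
To make the slerp idea concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is not the author's merge code or configuration: the function name `slerp`, the `eps` threshold, and the toy vectors are illustrative assumptions, and a real merge would apply this per tensor with the t values from the model's config.yaml.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Falls back to plain linear
    interpolation when the vectors are (nearly) parallel or zero.
    """
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return linear
    # Angle between the two directions, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return linear
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Toy example: halfway between two orthogonal unit vectors.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print([round(x, 4) for x in mid])  # [0.7071, 0.7071]
```

Unlike a plain weighted average, the halfway point keeps unit length here, which is the usual motivation for preferring slerp when blending weights.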

Key Development Steps

Base Model Merging: The first phase combined AIDC-ai-business/Marcoroni-7B-v3 and Q-bert/MetaMath-Cybertron-Starling. The merge's config.yaml specifies a slerp strategy that applies different interpolation values (t) to the self-attention and MLP layers.
DPO Fine-tuning: After the merge, the model was trained further with Direct Preference Optimization (DPO) on datasets from Hugging Face. This step aligns the model's outputs with human preferences, improving its conversational quality and instruction following.
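
As a rough illustration of the DPO step, the sketch below computes the standard DPO loss for a single preference pair from summed log-probabilities. It is not the author's training code: the function name `dpo_loss`, the default `beta=0.1`, and the example numbers are assumptions chosen only for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a response under
    the policy being trained or under the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) == softplus(-logits), computed stably.
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# Policy identical to the reference: loss is log(2).
print(round(dpo_loss(-1.0, -1.0, -1.0, -1.0), 4))  # 0.6931
```

The loss falls below log(2) once the policy raises the chosen response relative to the rejected one more than the reference model does, which is the direction training pushes.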

Usage and Template

The model expects an Alpaca-style instruction template. Format prompts as follows:

{system}
### Instruction:
{prompt}

### Response:
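
The template above can be filled in with a small helper such as the following sketch. The function name `build_prompt`, the empty default system message, and the trailing newline after `### Response:` are assumptions, not something the model card specifies.

```python
def build_prompt(prompt, system=""):
    """Fill the Alpaca-style template with a system message and a user prompt."""
    return (
        f"{system}\n"
        "### Instruction:\n"
        f"{prompt}\n"
        "\n"
        "### Response:\n"  # trailing newline is an assumption
    )

text = build_prompt(
    "Summarize the benefits of unit tests in two sentences.",
    system="You are a helpful assistant.",
)
print(text)
```

The resulting string is what would be sent as the raw prompt; the model's reply is expected to continue after the `### Response:` line.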

Intended Use Cases

Chat Applications: Optimized for generating coherent and contextually relevant responses in conversational settings.
Instruction Following: DPO training improves how well the model understands and carries out user instructions.
