viethq188/Rabbit-7B-DPO-Chat

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Dec 12, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

viethq188/Rabbit-7B-DPO-Chat is a 7-billion-parameter language model created by viethq188. It was developed by merging AIDC-ai-business/Marcoroni-7B-v3 and rwitz/go-bruins-v2 with the slerp merge method, then fine-tuned with Direct Preference Optimization (DPO) on a preference dataset hosted on Hugging Face, yielding a model optimized for chat-based interactions. It supports an 8192-token context length and is designed for general conversational applications.


Rabbit-7B-DPO-Chat Overview

Rabbit-7B-DPO-Chat is a 7 billion parameter language model developed by viethq188. It was created through a slerp merge of two base models: AIDC-ai-business/Marcoroni-7B-v3 and rwitz/go-bruins-v2. This merging process combined the strengths of both models across their 32 layers, with specific parameter weighting applied to self-attention and MLP components.
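As an illustration of the technique (not the exact merge configuration used for this model), the sketch below shows spherical linear interpolation (slerp) between two weight tensors. The interpolation factors and the per-component weighting are placeholder assumptions.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors.

    t=0 returns v0 (e.g. a Marcoroni-7B-v3 weight), t=1 returns v1
    (e.g. the matching go-bruins-v2 weight).
    """
    a = v0.flatten().float()
    b = v1.flatten().float()
    a_norm = a / (a.norm() + eps)
    b_norm = b / (b.norm() + eps)
    # Angle between the two flattened weight vectors.
    omega = torch.acos(torch.clamp(torch.dot(a_norm, b_norm), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        merged = (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return merged.reshape(v0.shape).to(v0.dtype)

# Toy demonstration with random tensors standing in for real checkpoint weights;
# a real merge applies a factor per layer and per component (self-attention vs. MLP).
w_a = torch.randn(256, 256)
w_b = torch.randn(256, 256)
merged_attn = slerp(0.7, w_a, w_b)  # hypothetical: lean toward model B for attention
merged_mlp = slerp(0.3, w_a, w_b)   # hypothetical: lean toward model A for MLP
```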

Following the merge, the model was further refined with Direct Preference Optimization (DPO) using a preference dataset from Hugging Face. DPO trains the model directly on pairs of preferred and rejected responses, aligning its outputs more closely with human preferences and enhancing its conversational quality and helpfulness.
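For a rough picture of what DPO optimizes (this is a generic sketch, not the author's training code), the loss for a batch of preference pairs can be computed as below; the beta value and log-probabilities are placeholder numbers.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for a batch of preference pairs.

    Each argument is the summed log-probability of a response under either
    the policy being trained or the frozen reference model.
    """
    # Implicit reward margins measured relative to the reference model.
    chosen_rewards = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_rewards = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the preferred response's implicit reward above the rejected one's.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy values: the policy already slightly prefers the chosen responses.
loss = dpo_loss(torch.tensor([-12.0, -15.0]), torch.tensor([-14.0, -16.5]),
                torch.tensor([-13.0, -15.5]), torch.tensor([-13.5, -16.0]))
print(loss)
```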

Key Characteristics

  • Architecture: Merged model based on Marcoroni-7B-v3 and go-bruins-v2.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports an 8192-token context window.
  • Fine-tuning: Utilizes DPO for improved conversational performance.
  • Template: Designed to work with the Alpaca instruction template for chat interactions.

Intended Use Cases

This model is suited to chat-based applications where a 7B-parameter model with DPO-enhanced conversational abilities is a good fit. Because it was trained with the Alpaca instruction template, prompts should follow that format for instruction-following and general dialogue generation (see the usage sketch below).
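A minimal inference sketch using the transformers library; the Alpaca-style prompt below is the commonly used template and is assumed here rather than quoted from the model card, and the sampling parameters are arbitrary.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "viethq188/Rabbit-7B-DPO-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Standard Alpaca instruction template (assumed formatting).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what a slerp model merge is in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Print only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```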