flemmingmiguel/Distilled-HermesChat-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 12, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

flemmingmiguel/Distilled-HermesChat-7B is a 7-billion-parameter language model created by flemmingmiguel by merging openchat/openchat-3.5-0106 and argilla/distilabeled-Hermes-2.5-Mistral-7B. The merge is experimental, intended to identify a strong base configuration for further fine-tuning. It uses the slerp merge method with separate interpolation weights for the self-attention and MLP layers, which makes it suitable for general conversational AI tasks and as a foundation for specialized applications.
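A minimal sketch of loading and prompting the model, assuming the standard Hugging Face transformers API; the prompt and generation settings below are illustrative and not recommendations from the model author.

```python
# Sketch: load the merged model and generate a reply with transformers.
# Assumes transformers, torch, and accelerate are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flemmingmiguel/Distilled-HermesChat-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; pick a dtype that fits your hardware
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```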


Overview

Distilled-HermesChat-7B is an experimental 7 billion parameter language model developed by flemmingmiguel. It is constructed through a merge of two distinct models: openchat/openchat-3.5-0106 and argilla/distilabeled-Hermes-2.5-Mistral-7B. The primary goal of this merge is to explore and identify the most effective base model combination for subsequent fine-tuning efforts.

Key Characteristics

  • Architecture: A merged model combining elements from OpenChat and Hermes 2.5 Mistral-7B.
  • Merge Method: Utilizes the slerp (spherical linear interpolation) merge method, with the t parameter tuned separately for the self-attention and MLP layers to balance the contributions of the two source models (see the sketch after this list).
  • Experimental Nature: Positioned as an experiment to benchmark and determine a strong foundational merge for further development.
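
For intuition, the sketch below shows the slerp operation itself applied to two weight tensors. It is not the exact mergekit configuration or t schedule used for this model, just the underlying interpolation, written with NumPy for clarity.

```python
# Illustrative sketch of spherical linear interpolation (slerp) between two
# weight tensors, the per-layer operation a slerp merge performs.
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between weight tensors w0 and w1 with factor t in [0, 1]."""
    v0 = w0.ravel() / (np.linalg.norm(w0) + eps)
    v1 = w1.ravel() / (np.linalg.norm(w1) + eps)
    dot = np.clip(np.dot(v0, v1), -1.0, 1.0)
    omega = np.arccos(dot)           # angle between the two weight directions
    if np.sin(omega) < eps:          # near-parallel tensors: fall back to linear interpolation
        return (1.0 - t) * w0 + t * w1
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return (s0 * w0.ravel() + s1 * w1.ravel()).reshape(w0.shape)

# t closer to 0 keeps more of the first source model, closer to 1 more of the
# second; merges like this one typically assign different t values to the
# self-attention and MLP layers.
```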

Good For

  • General Conversational AI: Suitable for a wide range of chat-based applications due to its lineage from instruction-tuned models.
  • Base for Fine-tuning: Designed as a robust starting point for developers looking to fine-tune a model for specific tasks or domains (a fine-tuning sketch follows this list).
  • Research and Experimentation: Ideal for researchers interested in model merging techniques and their impact on performance.
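
The sketch below shows one way to use the model as a fine-tuning base with PEFT/LoRA. The target module names and hyperparameters are assumptions for a Mistral-style architecture, not settings documented for this model.

```python
# Hypothetical sketch: attach LoRA adapters to the merged model for fine-tuning.
# Assumes the peft and transformers libraries are installed.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("flemmingmiguel/Distilled-HermesChat-7B")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical Mistral attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
# ...continue with a standard Trainer / supervised fine-tuning loop on your dataset.
```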