aloobun/Cypher-7B

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Apr 4, 2024 · License: cc · Architecture: Transformer

Cypher-7B by aloobun is a 7-billion-parameter language model created by merging NousResearch/Nous-Hermes-2-Mistral-7B-DPO and cognitivecomputations/samantha-1.1-westlake-7b-laser with the SLERP method. It combines the DPO alignment of Nous-Hermes-2 with the conversational character of Samantha-1.1, and is designed for general-purpose language tasks, offering a balanced blend of instruction following and conversational ability.


Cypher-7B: A SLERP Merged 7B Language Model

Cypher-7B is a 7 billion parameter language model developed by aloobun, created through a strategic merge of two prominent base models: NousResearch/Nous-Hermes-2-Mistral-7B-DPO and cognitivecomputations/samantha-1.1-westlake-7b-laser. This model utilizes the SLERP (Spherical Linear Interpolation) merge method, which is known for effectively combining the strengths of different models while maintaining coherence.

Key Characteristics

  • Merged Architecture: Combines the robust instruction-following and DPO (Direct Preference Optimization) alignment of Nous-Hermes-2-Mistral-7B-DPO with the conversational and reasoning capabilities of samantha-1.1-westlake-7b-laser.
  • Parameter Count: A 7 billion parameter model, offering a balance between performance and computational efficiency.
  • Merge Method: Employs the SLERP method, with specific interpolation weights applied to different model components such as `lm_head`, `embed_tokens`, `self_attn`, `mlp`, `layernorm`, and `modelnorm` to optimize the merged outcome.
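The idea behind SLERP can be illustrated with a minimal sketch: instead of averaging two weight tensors along a straight line, SLERP interpolates along the arc of the sphere through both, preserving the magnitude structure of the weights. The function below is an illustrative stand-alone implementation on plain Python lists, not the actual mergekit code used to produce Cypher-7B.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t = 0 returns v0, t = 1 returns v1; intermediate t values follow
    the arc between the two (normalized) directions.
    """
    norm0 = math.sqrt(sum(x * x for x in v0)) + eps
    norm1 = math.sqrt(sum(x * x for x in v1)) + eps
    # Cosine of the angle between the normalized vectors, clamped
    # to [-1, 1] to guard against floating-point drift.
    dot = sum((a / norm0) * (b / norm1) for a, b in zip(v0, v1))
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 0.9995:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Midpoint of two orthogonal unit vectors stays on the unit sphere,
# unlike a plain average, whose norm would shrink to ~0.707.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a real merge, a tool like mergekit applies this per-tensor, with different `t` weightings for component groups such as `self_attn` and `mlp`, which is what the parameter weighting described above refers to.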

Intended Use Cases

Cypher-7B is well-suited for a variety of general-purpose natural language processing tasks, including:

  • Instruction Following: Benefiting from the DPO-aligned base model, it can effectively follow user instructions.
  • Conversational AI: Leverages the conversational strengths of its Samantha-1.1 component.
  • Text Generation: Capable of generating coherent and contextually relevant text.

This model provides a versatile option for developers seeking a 7B model that integrates the best features of its constituent parts.
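The Nous-Hermes-2-Mistral-7B-DPO parent uses the ChatML prompt format. Assuming Cypher-7B inherits that format from this parent (an assumption worth verifying against the model's tokenizer configuration), prompts can be assembled like this:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt string.

    The format is inherited from the Nous-Hermes-2 parent; whether
    Cypher-7B uses it is an assumption, so verify against the
    model's chat template before relying on it.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize SLERP model merging in one sentence.",
)
```

The resulting string ends with the open `<|im_start|>assistant` turn, so the model's generation continues as the assistant reply.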