AurelPx/NeuralPipe-7B-slerp

Text generation · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Mar 21, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

AurelPx/NeuralPipe-7B-slerp is a 7-billion-parameter language model created by AurelPx through a slerp merge of OpenPipe/mistral-ft-optimized-1218 and mlabonne/NeuralHermes-2.5-Mistral-7B. The merge combines the strengths of its base components, yielding a balanced performance profile for general-purpose text generation and instruction-following tasks. It suits applications that need a capable 7B model with a 4096-token context window.


NeuralPipe-7B-slerp Overview

NeuralPipe-7B-slerp is a 7 billion parameter language model developed by AurelPx, created through a slerp merge of two distinct base models: OpenPipe/mistral-ft-optimized-1218 and mlabonne/NeuralHermes-2.5-Mistral-7B. This merging technique, facilitated by LazyMergekit, aims to combine the strengths and capabilities of its constituent models.
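LazyMergekit drives merges like this from a YAML configuration that names the source models, the layer range, and the per-block interpolation schedule. The sketch below shows the general shape such a slerp config takes; the specific `t` values are illustrative assumptions, not confirmed ratios from this model's actual merge:

```yaml
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    # Illustrative schedules: different ratios per block type,
    # varying across the depth of the network.
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5  # default for all other tensors
dtype: bfloat16
```

Here `t = 0` keeps the first model's weights and `t = 1` the second's, so the two filters bias attention blocks toward one parent and MLP blocks toward the other at corresponding depths.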

Key Characteristics

  • Architecture: Based on the Mistral architecture, inheriting its efficiency and performance characteristics.
  • Parameter Count: 7 billion parameters, offering a balance between computational efficiency and strong language understanding.
  • Context Length: Supports a context window of 4096 tokens, suitable for a variety of conversational and document-based tasks.
  • Merge Method: Utilizes the slerp (spherical linear interpolation) merge method, which is known for creating stable and effective combinations of models by interpolating their weights.
  • Configuration: The merge specifically interpolates weights across all 32 layers of the base models, with varying interpolation ratios applied to self-attention and MLP blocks to optimize performance.
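Slerp interpolates along the arc between two weight vectors rather than along the straight line between them, which better preserves the geometry of the merged tensors than plain averaging. A minimal NumPy sketch of the per-tensor operation (an illustration of the technique, not mergekit's actual implementation):

```python
import numpy as np

def slerp(t, w0, w1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Flattens both tensors, interpolates along the great-circle arc
    between them, and falls back to linear interpolation when the
    vectors are nearly colinear (where slerp is numerically unstable).
    """
    v0 = w0.ravel().astype(np.float64)
    v1 = w1.ravel().astype(np.float64)
    # Normalized copies, used only to measure the angle between tensors.
    n0 = v0 / (np.linalg.norm(v0) + eps)
    n1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(n0, n1), -1.0, 1.0)
    if abs(dot) > 0.9995:
        # Nearly parallel vectors: use ordinary lerp instead.
        return (1 - t) * w0 + t * w1
    theta = np.arccos(dot)  # angle between the two weight vectors
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return (s0 * v0 + s1 * v1).reshape(w0.shape).astype(w0.dtype)
```

At `t = 0` the result equals the first tensor and at `t = 1` the second; the merge described above applies a different `t` per layer and per block type rather than a single global ratio.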

Ideal Use Cases

  • General Text Generation: Capable of generating coherent and contextually relevant text for a wide range of prompts.
  • Instruction Following: Benefits from the instruction-tuned nature of its base models, making it effective for tasks requiring precise adherence to instructions.
  • Experimentation: Provides a solid foundation for developers and researchers looking to experiment with merged models or fine-tune a capable 7B model for specific applications.