allknowingroger/Calmesmol-7B-slerp

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Context Length: 8k | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

allknowingroger/Calmesmol-7B-slerp is a 7 billion parameter language model created by allknowingroger by slerp-merging MaziyarPanahi/Calme-7B-Instruct-v0.9 and rishiraj/smol-7b. The merge is intended to combine the strengths of its two constituent models, making it suitable for general instruction-following tasks.

Model Overview

Calmesmol-7B-slerp is a 7 billion parameter language model developed by allknowingroger. It is a product of a slerp merge (spherical linear interpolation) of two distinct base models:

  • MaziyarPanahi/Calme-7B-Instruct-v0.9
  • rishiraj/smol-7b

This merging technique aims to combine the desirable characteristics of both models, potentially leading to a more balanced and capable instruction-following model.

Key Characteristics

  • Merge Method: Uses slerp (spherical linear interpolation), which blends the two models' weights along the arc between them rather than averaging them linearly; see the sketch after this list.
  • Configuration: The merge was performed using LazyMergekit, with separate interpolation schedules (t) applied to the self-attention and MLP layers to fine-tune the merge outcome; a representative configuration follows the sketch below.
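
For intuition, here is a minimal Python sketch of slerp applied to two weight tensors. This illustrates the interpolation itself, not LazyMergekit's internal implementation; the function name and fallback threshold are choices made for this example.

```python
import torch

def slerp(t: float, w1: torch.Tensor, w2: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    v1, v2 = w1.flatten().float(), w2.flatten().float()
    # Cosine of the angle between the flattened weight vectors
    cos_omega = torch.dot(v1, v2) / (v1.norm() * v2.norm() + eps)
    omega = torch.arccos(cos_omega.clamp(-1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * w1 + t * w2
    sin_omega = torch.sin(omega)
    mixed = (torch.sin((1 - t) * omega) * v1 + torch.sin(t * omega) * v2) / sin_omega
    return mixed.reshape(w1.shape).to(w1.dtype)

# Example: halfway point between two random weight matrices
a, b = torch.randn(64, 64), torch.randn(64, 64)
halfway = slerp(0.5, a, b)
```

At t = 0 this returns the first model's weights and at t = 1 the second's; intermediate values move along the arc between the two weight vectors, which tends to preserve their geometry better than plain linear averaging.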
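
And here is a representative LazyMergekit-style slerp configuration, written out from Python. The layer ranges, t schedules, and base_model choice below are illustrative placeholders, not the actual values used for this merge.

```python
# Write a mergekit slerp config, then run the mergekit CLI on it.
# All specific values below are placeholders for illustration.
config = """\
slices:
  - sources:
      - model: MaziyarPanahi/Calme-7B-Instruct-v0.9
        layer_range: [0, 32]
      - model: rishiraj/smol-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: MaziyarPanahi/Calme-7B-Instruct-v0.9  # illustrative choice
parameters:
  t:
    - filter: self_attn
      value: [0.0, 0.5, 0.3, 0.7, 1.0]  # placeholder schedule
    - filter: mlp
      value: [1.0, 0.5, 0.7, 0.3, 0.0]  # placeholder schedule
    - value: 0.5                        # default t for remaining tensors
dtype: bfloat16
"""

with open("merge_config.yaml", "w") as f:
    f.write(config)

# Then run: mergekit-yaml merge_config.yaml ./Calmesmol-7B-slerp
```

The per-filter t schedules are what let a merge weight the self-attention layers toward one parent and the MLP layers toward the other, rather than applying a single global mixing factor.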

Intended Use Cases

This model is primarily intended for general instruction-following tasks, benefiting from the combined capabilities of its merged predecessors. Developers can integrate it into applications requiring a 7B parameter model for various natural language processing tasks.
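
As a sketch, the model loads like any Hugging Face causal LM. The snippet below assumes the repository ships a tokenizer with a chat template (merged models do not always include one), and the generation settings are placeholders rather than tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allknowingroger/Calmesmol-7B-slerp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain what a slerp model merge is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=256, do_sample=True,
    temperature=0.7, top_p=0.9,  # placeholder sampling settings
)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```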

Popular Sampler Settings

Featherless surfaces the three most popular sampler configurations its users apply to this model. Each configuration specifies values for the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
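
The sketch below shows how such a configuration maps onto an OpenAI-compatible chat completions request, assuming Featherless's OpenAI-compatible endpoint at https://api.featherless.ai/v1. Every sampler value shown is a placeholder, not one of the actual popular configurations.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="allknowingroger/Calmesmol-7B-slerp",
    messages=[{"role": "user", "content": "Write a haiku about model merging."}],
    temperature=0.8,          # placeholder values, not the popular configs
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard sampler fields pass through extra_body
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```

top_k, min_p, and repetition_penalty are not part of the standard OpenAI request schema, which is why they ride along in extra_body rather than as named arguments.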