Rama-adi/test-merge

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 10.7B | Quant: FP8 | Ctx Length: 4k | Architecture: Transformer | Cold

Rama-adi/test-merge is a merged language model created with mergekit using the TIES (TrIm, Elect Sign & Merge) method. It combines TheDrummer/Moistral-11B-v1 with Sao10K/Fimbulvetr-11B-v2, using the latter as its base model. The merge aims to combine the strengths of its two constituent models, each of roughly 11 billion parameters, into a single general-purpose language model.


Model Overview

Rama-adi/test-merge is a language model created through a merging process using mergekit. It uses the TIES (TrIm, Elect Sign & Merge) merge method, which trims small-magnitude parameter changes and resolves sign conflicts between models before averaging, so that the merged models interfere with each other less.

Merge Details

The base model for this merge is Sao10K/Fimbulvetr-11B-v2. It was merged with TheDrummer/Moistral-11B-v1. The configuration applied specific density and weight parameters to each contributing model, with Moistral-11B-v1 having a weight of 0.95 and Fimbulvetr-11B-v2 a weight of 0.05, both with a density of 0.5. The merge process also included normalization and used float16 for its data type.
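
The settings described above can be written out as a configuration structure. The following is a sketch only: it assumes the key layout that mergekit's YAML configs commonly use (`merge_method`, `base_model`, `models`, `parameters`, `dtype`), and the original configuration file is not reproduced on this page. The helper `total_weight` is a hypothetical name added for illustration.

```python
import json

# Reconstruction of the merge settings stated in the text.
# The key layout is an assumption based on mergekit's usual config schema;
# the values (models, weights, densities, normalization, dtype) come from the text.
merge_config = {
    "merge_method": "ties",
    "base_model": "Sao10K/Fimbulvetr-11B-v2",
    "models": [
        {
            "model": "TheDrummer/Moistral-11B-v1",
            "parameters": {"density": 0.5, "weight": 0.95},
        },
        {
            "model": "Sao10K/Fimbulvetr-11B-v2",
            "parameters": {"density": 0.5, "weight": 0.05},
        },
    ],
    "parameters": {"normalize": True},
    "dtype": "float16",
}


def total_weight(config):
    """Sum the per-model weights (hypothetical helper, for illustration only)."""
    return sum(m["parameters"]["weight"] for m in config["models"])


if __name__ == "__main__":
    # JSON is used here only to print the structure; mergekit itself reads YAML.
    print(json.dumps(merge_config, indent=2))
```

The two weights sum to 1, and both models share the same density of 0.5.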

Key Characteristics

  • Merge-based Architecture: Built by combining two existing 11 billion parameter models.
  • TIES Method: Employs a merging technique that trims low-magnitude parameter changes and resolves sign conflicts to reduce interference between the merged models.
  • Leverages Pre-trained Strengths: Aims to inherit and combine the capabilities of its constituent models, Sao10K/Fimbulvetr-11B-v2 and TheDrummer/Moistral-11B-v1.
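
To make the role of the density and weight parameters concrete, here is a minimal sketch of the general TIES idea on flat lists of numbers. It is an illustration under simplifying assumptions, not mergekit's implementation: the function names `trim` and `ties_merge` are hypothetical, real merges operate per tensor, and details such as tie-breaking and base-model handling may differ.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(round(density * len(delta)))
    if k <= 0:
        return [0.0] * len(delta)
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, finetuned, weights, density, normalize=True):
    """Merge fine-tuned parameter lists into `base` with a TIES-style procedure."""
    # 1. Task vectors: how each model differs from the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # 2. Trim: drop the low-magnitude changes of each model.
    trimmed = [trim(d, density) for d in deltas]

    merged = []
    for i, b in enumerate(base):
        contribs = [w * t[i] for w, t in zip(weights, trimmed)]
        # 3. Elect sign: the sign of the weighted sum wins.
        positive = sum(contribs) >= 0
        # 4. Merge: keep only nonzero contributions that agree with that sign.
        agree = [
            (c, w)
            for c, w in zip(contribs, weights)
            if c != 0.0 and (c > 0) == positive
        ]
        update = sum(c for c, _ in agree)
        if normalize:
            denom = sum(w for _, w in agree)
            update = update / denom if denom else 0.0
        merged.append(b + update)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [4.0, -3.0, 0.1, 0.2]
    model_b = [-2.0, -1.0, 0.3, 5.0]
    print(ties_merge(base, [model_a, model_b], [0.5, 0.5], density=0.5))
```

In the toy example, the two models disagree in sign on the first parameter, so only the contribution matching the elected sign is kept; parameters that were trimmed from both models stay at the base value.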

Potential Use Cases

This model suits users who want to experiment with or deploy a merge of Sao10K/Fimbulvetr-11B-v2 and TheDrummer/Moistral-11B-v1. It is most useful for tasks that benefit from the combined strengths of both models, and it may serve as a versatile option for general language generation and understanding.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p