rwitz2/mergemix

Text generation · Model size: 7B · Quantization: FP8 · Context length: 8k · Published: Dec 11, 2023 · License: apache-2.0 · Architecture: Transformer

rwitz2/mergemix is a DARE-TIES merged language model based on the Mistral-7B-v0.1 architecture. It combines Mistral-7B-v0.1 with rwitz/go-bruins-v2, rwitz/dec10, and AIDC-ai-business/Marcoroni-7B-v3, each contributing with its own merge weight and density. The merge is configured for the bfloat16 dtype with the int8_mask option enabled. The model's specific capabilities and primary use cases are not detailed in the model card.


Model Overview

rwitz2/mergemix is a merged language model produced with the DARE-TIES method, which combines DARE (Drop And REscale) delta sparsification with TIES (TrIm, Elect Sign & Merge) sign-consensus merging. It is built upon the mistralai/Mistral-7B-v0.1 base model, integrating components from three other models: rwitz/go-bruins-v2, rwitz/dec10, and AIDC-ai-business/Marcoroni-7B-v3.
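To make the DARE half of the method concrete, the sketch below applies drop-and-rescale to a toy weight matrix: a random (1 − density) fraction of the fine-tuned delta is zeroed out, and the survivors are rescaled by 1/density so the delta's expected value is preserved. This is an illustrative sketch only; it omits TIES sign election, and mergekit's actual implementation differs in detail.

```python
import numpy as np

def dare_delta(finetuned, base, density, rng):
    # DARE: drop a random (1 - density) fraction of the delta's
    # entries, then rescale the survivors by 1/density so the
    # expected value of the delta is unchanged.
    delta = finetuned - base
    keep = rng.random(delta.shape) < density  # keep ~`density` of entries
    return np.where(keep, delta / density, 0.0)

rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))  # stands in for one base-model weight matrix
finetuned = base + rng.normal(scale=0.1, size=(4, 4))

sparse = dare_delta(finetuned, base, density=0.6, rng=rng)
merged = base + 0.4 * sparse  # weight 0.4, as used for go-bruins-v2 below
```

In the real merge this is done per tensor for each contributing model, and the sparsified deltas are combined by their weights (with TIES resolving sign conflicts) before being added back to the base model.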

Merge Configuration

The merge uses the following weight and density for each contributing model:

  • rwitz/go-bruins-v2: weight 0.4, density 0.6
  • rwitz/dec10: weight 0.2, density 0.5
  • AIDC-ai-business/Marcoroni-7B-v3: weight 0.4, density 0.6
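The parameters above correspond to a mergekit configuration along these lines. This is a hedged reconstruction — the card does not show the actual config file — but the field names (`merge_method: dare_ties`, per-model `weight`/`density`, `int8_mask`, `dtype`) follow mergekit's conventions:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # base model: no weight/density needed
  - model: rwitz/go-bruins-v2
    parameters:
      weight: 0.4
      density: 0.6
  - model: rwitz/dec10
    parameters:
      weight: 0.2
      density: 0.5
  - model: AIDC-ai-business/Marcoroni-7B-v3
    parameters:
      weight: 0.4
      density: 0.6
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```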

Technical Details

The model is configured to use the bfloat16 data type, and the merge enables the int8_mask option (which stores merge masks in int8 to reduce memory use during merging). Further details regarding its specific capabilities, intended direct or downstream uses, training data, evaluation metrics, and performance benchmarks are not provided in the current model card.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler settings (the specific values are only available in the interactive configs on the model page):

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p