Chaotically/model_sft_dare
Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Mar 21, 2026 | Architecture: Transformer | Cold

Chaotically/model_sft_dare is a merged language model created with the Linear DARE method, using base_model_temp as the base model. It merges in sft_model_temp with a density of 0.7 and a weight of 1.0. The model is intended for general language tasks and aims to combine the capabilities of its two constituent models.

Overview

Chaotically/model_sft_dare is a merged language model built with the Linear DARE merge method, described in the paper at https://arxiv.org/abs/2311.03099. It uses base_model_temp as the base and merges in sft_model_temp to combine the capabilities of both.

Key Capabilities

  • Model Merging: Utilizes the Linear DARE method for combining pre-trained language models.
  • Configurable Integration: sft_model_temp is merged with a density of 0.7 (the fraction of its fine-tuning deltas that are kept, with the rest dropped at random) and a weight of 1.0.
  • Base Model Foundation: Uses base_model_temp as the base model whose weights the merge starts from.
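
To make the density and weight parameters concrete, here is a minimal sketch of the drop-and-rescale idea behind DARE, applied to one flat list of weights. It is an illustration under stated assumptions, not the code used to produce this model: the function name dare_linear_merge, the flat-list representation, and the seed argument are all hypothetical, and a real merge operates on full tensors for every layer.

```python
import random


def dare_linear_merge(base, finetuned, density=0.7, weight=1.0, seed=0):
    """Illustrative DARE-style merge of one fine-tuned weight list into a base.

    Hypothetical helper for explanation only; not the actual merge tooling.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b  # change introduced by fine-tuning
        if rng.random() < density:
            # Keep this delta and rescale it so the expected delta is unchanged.
            delta = delta / density
        else:
            # Drop this delta entirely.
            delta = 0.0
        merged.append(b + weight * delta)
    return merged


# Toy weights standing in for base_model_temp and sft_model_temp.
base = [0.0] * 10
finetuned = [1.0] * 10
merged = dare_linear_merge(base, finetuned, density=0.7, weight=1.0, seed=0)
print(merged)
```

With a density of 0.7, roughly 70% of the deltas survive and each survivor is scaled by 1/0.7, so the average shift from the base stays close to that of the fine-tuned model. With a weight of 1.0 the rescaled deltas are added to the base at full strength.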

Good For

  • Experimentation with Merged Models: Ideal for researchers and developers interested in exploring the performance characteristics of models created via the Linear DARE method.
  • Leveraging Combined Strengths: Suitable for tasks that may benefit from combining the capabilities of base_model_temp and sft_model_temp.
  • Custom Model Development: Provides a starting point for further fine-tuning or integration into specific applications where a merged model approach is desired.