Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties is a 1.5-billion-parameter language model merge created with the DARE TIES method. It is based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, combines two actor models, and supports a context length of 131,072 tokens. The result is a compact model intended for applications that need long-context capability, with model merging used in the hope of improved performance over either component alone.
Model Overview
This model, merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties, is a 1.5-billion-parameter language model created by Zachary1150. It was built with the mergekit tool using the DARE TIES merge method, which combines DARE's random pruning and rescaling of weight deltas with TIES-Merging's sign-consensus averaging.
Key Characteristics
- Base Model: The merge is built upon deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
- Merged Components: It integrates two distinct actor models, each contributing with a weight of 0.5 and a density of 0.5, a balanced combination.
- Merge Method: Utilizes the DARE TIES technique, which randomly prunes and rescales each model's weight deltas, then merges the survivors under a sign-consensus rule.
- Context Length: Features a 131,072-token context window, allowing very long inputs to be processed.
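A mergekit configuration matching the characteristics above would look roughly like the following. This is a hypothetical reconstruction: the card does not name the two actor models, so placeholder identifiers are used, and the dtype is an assumption.

```yaml
# Hypothetical mergekit config sketch; the two actor model names
# are placeholders -- the card does not list them.
merge_method: dare_ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: actor-model-1   # placeholder
    parameters:
      weight: 0.5
      density: 0.5
  - model: actor-model-2   # placeholder
    parameters:
      weight: 0.5
      density: 0.5
dtype: bfloat16            # assumed, not stated on the card
```

Here `density` controls what fraction of each delta's entries DARE keeps, and `weight` scales each actor's contribution to the final merge.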
Potential Use Cases
- Long-Context Applications: Ideal for tasks requiring extensive contextual understanding, such as summarizing long documents, code analysis, or complex conversational agents.
- Resource-Constrained Environments: As a 1.5B parameter model, it offers a balance of capability and efficiency, suitable for deployment where larger models are impractical.
- Experimental Merging: Useful for researchers and developers interested in exploring the effects of the DARE TIES merging strategy on specific base models and components.
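For readers exploring the merging strategy itself, the arithmetic behind DARE TIES can be sketched on toy weight vectors. This is an illustrative simplification, not mergekit's implementation: DARE randomly drops a fraction of each actor's delta from the base and rescales the survivors, and TIES then keeps only contributions agreeing with a per-parameter elected sign.

```python
import numpy as np

rng = np.random.default_rng(0)

def dare(delta, density, rng):
    # DARE: randomly drop (1 - density) of the delta's entries,
    # then rescale survivors by 1/density so the expected
    # magnitude of the delta is preserved.
    mask = rng.random(delta.shape) < density
    return delta * mask / density

def dare_ties_merge(base, actors, density=0.5, weight=0.5, rng=rng):
    # Task vectors: each actor's difference from the base weights,
    # sparsified by DARE and scaled by the per-model weight.
    weighted = [weight * dare(a - base, density, rng) for a in actors]
    stacked = np.stack(weighted)
    # TIES sign election: per parameter, take the sign of the
    # summed contributions (the direction with more total mass).
    elected = np.sign(stacked.sum(axis=0))
    # Keep only contributions that agree with the elected sign,
    # then sum them into a single merged delta.
    agree = np.where(np.sign(stacked) == elected, stacked, 0.0)
    return base + agree.sum(axis=0)

base = np.zeros(8)
actor_a = np.array([1.0, -1.0, 0.5, 0.0,  2.0, -0.5, 1.0, 0.0])
actor_b = np.array([1.0,  1.0, 0.5, 0.0, -2.0, -0.5, 1.0, 0.0])
merged = dare_ties_merge(base, [actor_a, actor_b])
print(merged)
```

Where the two actors pull a parameter in opposite directions (indices 1 and 4 above), only the winning sign survives; where they agree, the contributions reinforce each other.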