Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties_density0.2
Text generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Jan 1, 2026 · Architecture: Transformer

Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties_density0.2 is a 1.5 billion parameter language model created by Zachary1150. It is a merge of two pre-trained models, acc_MRL4096 and accfmt_MRL4096, using the DARE TIES method with deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as its base. This model is designed for tasks benefiting from the combined strengths of its merged components, offering a 131072 token context length.


Model Overview

This model, Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties_density0.2, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the DARE TIES merge method, which combines the weights of multiple pre-trained models. The base model for this merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.

Merge Details

The model integrates two distinct pre-trained components: acc_MRL4096 and accfmt_MRL4096. Each was merged with a weight of 0.5 and a density of 0.2, meaning only about 20% of each component's delta parameters (its differences from the base model) were retained, with the surviving values rescaled to compensate. This approach aims to leverage the specific capabilities or knowledge encoded in each source model, potentially yielding a more robust or specialized merged model. The DARE TIES method combines DARE's random dropping and rescaling of delta parameters with TIES-Merging's sign-consensus resolution of parameter conflicts, and is known for combining models effectively while mitigating interference between their weight updates.
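The parameters described above correspond to a mergekit-style configuration. The following YAML is a hypothetical reconstruction from the stated weights and densities, not a published file; the source-model repository paths are assumed to match the names given above:

```yaml
# Hypothetical mergekit config reconstructed from the stated parameters.
merge_method: dare_ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: acc_MRL4096
    parameters:
      weight: 0.5
      density: 0.2
  - model: accfmt_MRL4096
    parameters:
      weight: 0.5
      density: 0.2
dtype: bfloat16
```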

Key Characteristics

  • Architecture: Merged model based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
  • Parameter Count: 1.5 billion parameters.
  • Context Length: Supports a substantial context window of 131072 tokens.
  • Merge Method: Utilizes the DARE TIES technique for combining model weights.
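The DARE TIES merge listed above can be sketched in a few lines. The following is a simplified, illustrative implementation over flat weight vectors, not the actual merge code (real merges operate tensor-by-tensor over full checkpoints, and the function and variable names here are hypothetical):

```python
import numpy as np

def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Simplified DARE-TIES merge over flat weight vectors.

    base      -- 1-D array of base-model weights
    finetuned -- list of 1-D arrays, one per source model
    weights   -- per-model merge weights (0.5 each for this model)
    density   -- fraction of delta parameters kept (0.2 here)
    """
    rng = np.random.default_rng(seed)
    weighted_deltas = []
    for ft, w in zip(finetuned, weights):
        delta = ft - base                              # task vector vs. base
        keep = rng.random(delta.shape) < density       # DARE: random drop
        delta = np.where(keep, delta / density, 0.0)   # rescale survivors
        weighted_deltas.append(w * delta)

    # TIES: elect a per-parameter sign from the summed weighted deltas,
    # then average only the contributions that agree with that sign.
    elected = np.sign(sum(weighted_deltas))
    merged = np.zeros_like(base)
    agree_count = np.zeros_like(base)
    for d in weighted_deltas:
        agrees = (np.sign(d) == elected) & (d != 0)
        merged += np.where(agrees, d, 0.0)
        agree_count += agrees
    merged /= np.maximum(agree_count, 1)
    return base + merged
```

With density 0.2, roughly 80% of each model's delta parameters are zeroed out, which is what lets two fine-tunes be combined with limited interference.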

Potential Use Cases

Given its merged nature and the specific components involved, this model is likely suitable for applications where the combined expertise of the source models is beneficial. Its large context window also makes it applicable for tasks requiring extensive input understanding or generation.