Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties is a 1.5-billion-parameter language model created by Zachary1150. It is a merge of pre-trained language models built with the DARE TIES merge method, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base and integrating two distinct actor models, so it targets tasks that benefit from the combined strengths of its components.
Model Overview
This model was constructed with the mergekit tool using the DARE TIES merge method, which combines multiple fine-tuned checkpoints on top of a shared base model.
Merge Details
This model's foundation is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. It integrates two distinct actor models, each merged with a weight of 0.5 and a density of 0.5. The merge was computed in the bfloat16 data type with normalization enabled.
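For reference, a mergekit configuration matching these reported settings would look roughly like the sketch below. The two actor repositories are not named on this card, so the model entries are placeholders; this is a reconstruction under those assumptions, not the exact config used.

```python
# Hypothetical reconstruction of the merge config; "actor-1" and
# "actor-2" are placeholders, not the actual repositories used.
import yaml

config = {
    "merge_method": "dare_ties",
    "base_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    "models": [
        {"model": "actor-1", "parameters": {"weight": 0.5, "density": 0.5}},
        {"model": "actor-2", "parameters": {"weight": 0.5, "density": 0.5}},
    ],
    "parameters": {"normalize": True},  # normalization, as stated above
    "dtype": "bfloat16",
}

with open("merge_config.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# The merge itself would then be run with mergekit's CLI:
#   mergekit-yaml merge_config.yml ./merged-model
```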
Key Characteristics
- Merge Method: DARE TIES, which sparsifies each component's task vector (DARE) and resolves per-parameter sign conflicts between components (TIES), letting models be combined with little interference and minimal loss of performance; a toy sketch follows this list.
- Base Model: Built upon deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, providing a strong foundational architecture.
- Component Models: Incorporates two specific actor models, suggesting an optimization for tasks where their combined expertise is beneficial.
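To make the method concrete, below is a toy sketch of the two ideas behind DARE TIES: DARE randomly drops and rescales each model's task vector (its delta from the base weights), and TIES elects a per-parameter sign to discard conflicting updates. This is an illustration only, not mergekit's implementation.

```python
import torch

def dare(delta: torch.Tensor, density: float) -> torch.Tensor:
    # Drop And REscale: keep roughly `density` of the task vector's
    # entries, rescaling survivors by 1/density to preserve the
    # expected magnitude of the update.
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

def ties_sum(deltas: list[torch.Tensor], weights: list[float]) -> torch.Tensor:
    # TIES-style sign election: per parameter, take the sign of the
    # weighted sum of deltas, then keep only the deltas that agree
    # with the elected sign before summing.
    stacked = torch.stack([w * d for w, d in zip(weights, deltas)])
    elected = torch.sign(stacked.sum(dim=0))
    agree = (torch.sign(stacked) == elected).float()
    return (stacked * agree).sum(dim=0)

# Toy merge of two "actor" task vectors onto a base parameter tensor,
# mirroring this card's equal weight (0.5) and density (0.5) settings.
base = torch.zeros(8)
deltas = [torch.randn(8), torch.randn(8)]
sparse = [dare(d, density=0.5) for d in deltas]
merged = base + ties_sum(sparse, weights=[0.5, 0.5])
print(merged)
```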
Potential Use Cases
Given its merged nature, this model should suit applications that draw on a blend of capabilities from its constituent models, and it may outperform the individual components on tasks where their strengths overlap. At 1.5B parameters it is small enough to deploy efficiently on modest hardware while still offering solid language understanding and generation; a loading example follows.
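Assuming the repository follows the standard Hugging Face layout for a Qwen2-based checkpoint (the architecture inherited from its DeepSeek-R1-Distill-Qwen-1.5B base, which this card does not state explicitly), it should load with the stock transformers API; the prompt below is only an example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id from this model card; loading in bfloat16 to match
# the dtype used during the merge.
model_id = "Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain what a model merge is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```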