Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties is a 1.5 billion parameter language model created by Zachary1150 with the DARE TIES merge method. It uses deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as its base and combines two specialized fine-tuned models, with the aim of pooling their strengths for general language understanding and generation tasks.
Model Overview
This model was created using DARE TIES, a merge method designed to combine multiple fine-tuned language models by sparsifying their parameter deltas and resolving sign conflicts between them. The base model for the merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, which provides the underlying architecture and the foundation for the merged model's capabilities.
Merge Details
This model integrates two distinct fine-tuned models into a single set of weights. Each contributing model was merged with a weight of 0.5 and a density of 0.5: the two models contribute equally, and half of each model's delta parameters (its difference from the base model) are retained. DARE TIES combines DARE, which randomly drops a fraction of each delta and rescales the survivors to preserve their expected value, with TIES, which elects a per-parameter sign and discards conflicting updates before summing the deltas into the base. The sketch below illustrates this arithmetic.
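For intuition, here is a minimal, hypothetical PyTorch sketch of the per-tensor arithmetic behind a DARE TIES merge with these settings. The function name and arguments are illustrative, not taken from this model's actual merge configuration, and real tooling such as mergekit additionally handles weight loading, normalization, and tokenizer/embedding edge cases that are omitted here.

```python
import torch

def dare_ties_tensor(base, tuned_a, tuned_b, density=0.5, weight=0.5):
    """Illustrative DARE TIES merge of one parameter tensor from two models."""
    pruned = []
    for tuned in (tuned_a, tuned_b):
        delta = tuned - base                      # task vector vs. the base model
        # DARE: randomly drop (1 - density) of the delta entries and rescale
        # the survivors by 1/density to keep the expected value unchanged.
        mask = torch.bernoulli(torch.full_like(delta, density))
        pruned.append(weight * mask * delta / density)

    stacked = torch.stack(pruned)                 # shape: (2, *param_shape)
    # TIES: elect a per-entry sign from the summed deltas, then zero out
    # entries whose sign disagrees with the elected one before summing.
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    merged_delta = torch.where(agree, stacked, torch.zeros_like(stacked)).sum(dim=0)
    return base + merged_delta
```

With equal weights of 0.5, a sign conflict between the two pruned deltas is resolved in favor of whichever entry has the larger magnitude, and the losing entry is dropped rather than averaged in.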
Key Characteristics
- Architecture: Merged model based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
- Parameter Count: 1.5 billion parameters.
- Merge Method: DARE TIES, applied with a weight of 0.5 and a density of 0.5 per model.
- Context Length: Supports a context length of 131,072 tokens (see the check after this list).
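As a quick check of the figures above, one can read the merged checkpoint's configuration. Note that `max_position_embeddings` is the field Qwen2-style configs use for the context window, so the exact attribute name is an assumption about this checkpoint.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties"
)
# Qwen2-style configs expose the context window here; 131072 is expected.
print(config.model_type, config.max_position_embeddings)
```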
Potential Use Cases
This model is suitable for a variety of natural language processing tasks, including text generation, summarization, and question answering, drawing on the combined strengths of its constituent models. A minimal loading sketch follows.
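Below is a minimal sketch of loading and running the model with Hugging Face transformers. The prompt is a placeholder, `device_map="auto"` assumes the accelerate package is installed, and, since the base model is a DeepSeek-R1 distillation, applying the tokenizer's chat template (if one is present) is likely the intended usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_dare_ties"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires the accelerate package
)

prompt = "Explain model merging in two sentences."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```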