Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties_density0.2

Text generation · Model size: 1.5B · Quant: BF16 · Ctx length: 32k · Published: Jan 1, 2026 · Architecture: Transformer

Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties_density0.2 is a 1.5-billion-parameter language model produced with the TIES merge method, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base. It combines two fine-tuned actor models, acc_MRL4096_ROLLOUT4_LR2e-6 and accfmt_MRL4096_ROLLOUT4_LR2e-6, and supports a context length of 131072 tokens. The merge is intended to combine the strengths of both fine-tunes in a single checkpoint.

Model Overview

This model, merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties_density0.2, is a 1.5-billion-parameter language model created by Zachary1150. It was built with the TIES merge method from mergekit, which combines the weights of multiple fine-tuned models into a single checkpoint. The base model for the merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
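
A minimal usage sketch, assuming the repository loads with the standard transformers AutoModel classes as its DeepSeek-R1-Distill-Qwen-1.5B base does (the prompt and generation settings are illustrative, not recommendations from the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties_density0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the listed quantization
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the TIES merging method."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```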

Key Characteristics

  • Merge Method: Uses TIES (TrIm, Elect Sign & Merge), which trims low-magnitude parameter changes, elects a per-parameter majority sign, and merges only the updates that agree with it, reducing interference between the source models.
  • Base Model: Built upon deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, inheriting its foundational capabilities.
  • Constituent Models: The merge incorporates two actor models, acc_MRL4096_ROLLOUT4_LR2e-6 and accfmt_MRL4096_ROLLOUT4_LR2e-6, each merged with a weight of 0.5 and a density of 0.2 (i.e., only the top 20% of each model's parameter deltas by magnitude are retained); see the configuration sketch after this list.
  • Context Length: Supports a substantial context window of 131072 tokens, allowing for processing and generating longer sequences of text.
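
The model card does not publish the exact mergekit configuration, but the parameters above imply something close to the following sketch. The constituent model paths are placeholders (their repositories are not named on the card), and the `normalize` setting is an assumption:

```yaml
# Hypothetical reconstruction of the merge config; the actor model
# paths below are placeholders, not published repository names.
models:
  - model: path/to/acc_MRL4096_ROLLOUT4_LR2e-6
    parameters:
      weight: 0.5
      density: 0.2
  - model: path/to/accfmt_MRL4096_ROLLOUT4_LR2e-6
    parameters:
      weight: 0.5
      density: 0.2
merge_method: ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
parameters:
  normalize: true  # assumed, not stated on the model card
dtype: bfloat16
```

A config like this would be run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./merged-model`.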

Potential Use Cases

Given its merge-based construction from specialized actor models, this model is likely suitable for applications that benefit from the combined expertise of its components. Developers interested in exploring the effects of TIES merging on specific fine-tuned models, particularly those derived from the DeepSeek-R1-Distill-Qwen-1.5B family, may find this model valuable for research and development.
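
For readers studying the effects of TIES merging, the following toy sketch shows what the trim, elect, and merge steps do on a single weight tensor. It is an illustration of the published algorithm (Yadav et al., 2023) under this model's density 0.2 and weight 0.5 settings, not mergekit's actual implementation, and the example tensors are made up:

```python
import torch

def ties_merge_tensor(base, finetuned, density=0.2, weights=None):
    """Toy TIES merge of a single weight tensor."""
    weights = weights or [1.0] * len(finetuned)
    deltas = [ft - base for ft in finetuned]  # task vectors vs. the base
    trimmed = []
    for d in deltas:
        # Trim: keep only the top-`density` fraction of entries by magnitude.
        k = max(1, int(density * d.numel()))
        cutoff = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= cutoff, d, torch.zeros_like(d)))
    weighted = [w * t for w, t in zip(weights, trimmed)]
    # Elect: per-parameter majority sign across the weighted task vectors.
    elected = torch.sign(sum(weighted))
    # Merge: average only the entries whose sign agrees with the elected sign.
    agreeing = [torch.where(torch.sign(t) == elected, t, torch.zeros_like(t))
                for t in weighted]
    counts = sum((a != 0).float() for a in agreeing).clamp(min=1.0)
    return base + sum(agreeing) / counts

# Two toy "fine-tuned" tensors around a shared base:
base = torch.zeros(10)
ft_a = base + torch.tensor([0.9, -0.1, 0.0, 0.2, 0.0, 0.0, -0.8, 0.1, 0.0, 0.0])
ft_b = base + torch.tensor([0.7, 0.1, 0.0, -0.9, 0.0, 0.0, -0.6, 0.0, 0.0, 0.0])
print(ties_merge_tensor(base, [ft_a, ft_b], density=0.2, weights=[0.5, 0.5]))
```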