Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties_density0.2

Text Generation · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Jan 1, 2026 · Architecture: Transformer

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties_density0.2 is a 1.5-billion-parameter language model merged from two specialized models with the TIES method, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base. The merge combines distinct capabilities from its two source models, likely targeting improved performance on specific language understanding or generation tasks. At 1.5B parameters it is efficient to deploy while supporting a context length of 131072 tokens, making it suitable for applications that require extensive contextual awareness.


Model Overview

This model, merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties_density0.2, is a 1.5-billion-parameter language model created by Zachary1150. It was produced with the TIES merge method (TrIm, Elect Sign & Merge), which combines the strengths of two distinct fine-tuned models while resolving interference between their parameters. The base model for this merge was deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
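
As a standard checkpoint on the Hub, the model should load through the usual transformers API. A minimal sketch, assuming the repository is public and that the Qwen2 architecture inherited from DeepSeek-R1-Distill-Qwen-1.5B is supported by your transformers version:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties_density0.2"

# BF16 matches the quantization listed on the card; device_map needs accelerate.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# DeepSeek-R1-Distill models ship a chat template, so we assume the merge
# inherits it from the base model.
messages = [{"role": "user", "content": "Explain model merging in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```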

Merge Details

The merge combined two source models, each contributing with a weight of 0.5 and a density of 0.2. In TIES terms, a density of 0.2 means only the top 20% of each model's parameter deltas (by magnitude) are retained, and the equal weights indicate a balanced integration of the two. Rather than simply averaging weights, TIES trims redundant deltas and resolves sign conflicts between the models, so the merge aims for a synergistic outcome rather than a plain average.
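
The card does not name the two source models, but the listed hyperparameters (TIES, weight 0.5, density 0.2, BF16) map directly onto a mergekit configuration. A hypothetical reconstruction follows; the placeholder model names are invented for illustration, and the `lenfmt` fragment of the repository name only hints that the components target length and formatting behavior:

```python
# Hypothetical reconstruction of the mergekit config implied by this card.
# The two source model names are NOT given on the card; the placeholders
# below stand in for whatever models were actually merged.
import yaml

merge_config = {
    "merge_method": "ties",
    "base_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    "dtype": "bfloat16",
    "models": [
        {"model": "placeholder/length-model",   # placeholder, not from the card
         "parameters": {"weight": 0.5, "density": 0.2}},
        {"model": "placeholder/format-model",   # placeholder, not from the card
         "parameters": {"weight": 0.5, "density": 0.2}},
    ],
}

with open("merge_config.yml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)

# The merge itself would then run via mergekit's CLI:
#   mergekit-yaml merge_config.yml ./merged-model
```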

Key Characteristics

  • Architecture: Based on the DeepSeek-R1-Distill-Qwen-1.5B family.
  • Parameter Count: 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: A substantial context window of 131072 tokens, enough to process very long inputs, maintain extended conversational history, or reason over entire documents.
  • Merge Method: Utilizes TIES, which selectively merges parameters (trim, elect sign, merge) and can carry specialized capabilities over from the source models; see the sketch after this list.
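
To make the selective merge concrete, here is a minimal single-tensor sketch of the TIES procedure, written against the published algorithm rather than the exact mergekit implementation:

```python
import torch

def ties_merge_tensor(base: torch.Tensor,
                      tuned: list[torch.Tensor],
                      weight: float = 0.5,
                      density: float = 0.2) -> torch.Tensor:
    """Illustrative TIES merge for one parameter tensor.

    base:  the base-model tensor (here, from DeepSeek-R1-Distill-Qwen-1.5B)
    tuned: fine-tuned tensors of the same shape, one per source model
    """
    trimmed = []
    for t in tuned:
        delta = t - base
        # Trim: keep only the top `density` fraction of entries by magnitude.
        k = max(1, int(density * delta.numel()))
        threshold = delta.abs().flatten().topk(k).values.min()
        trimmed.append(torch.where(delta.abs() >= threshold, delta,
                                   torch.zeros_like(delta)))

    stacked = torch.stack(trimmed)
    # Elect sign: per entry, take the sign with the larger total magnitude.
    elected = torch.sign(stacked.sum(dim=0))
    # Merge: average only the surviving deltas that agree with the elected sign.
    agree = (torch.sign(stacked) == elected) & (stacked != 0)
    summed = torch.where(agree, stacked, torch.zeros_like(stacked)).sum(dim=0)
    count = agree.sum(dim=0).clamp(min=1)
    return base + weight * summed / count

# Smoke test with random tensors standing in for real weights.
base = torch.randn(4, 4)
merged = ties_merge_tensor(base, [base + torch.randn(4, 4),
                                  base + torch.randn(4, 4)])
print(merged.shape)  # torch.Size([4, 4])
```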

Potential Use Cases

Given its large context window and specialized merge approach, this model could be particularly effective for:

  • Applications requiring deep contextual understanding over long documents (see the example after this list).
  • Tasks benefiting from the combined strengths of its merged components, such as enhanced reasoning or specific formatting adherence.
  • Scenarios where a 1.5B parameter model offers a good trade-off between performance and resource usage.
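
As a concrete instance of the long-document use case, a sketch that feeds a lengthy file through the chat template; the file path and prompt are illustrative, and `model` and `tokenizer` are assumed loaded as in the quick-start above:

```python
# Assumes `model` and `tokenizer` from the loading sketch in Model Overview.
with open("long_report.txt") as f:  # illustrative path
    document = f.read()

messages = [{
    "role": "user",
    "content": f"Summarize the key findings of this report:\n\n{document}",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The prompt fits as long as it stays within the advertised context window.
print(f"prompt length: {inputs.shape[-1]} tokens")

output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```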