Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_ties
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_ties is a 1.5-billion-parameter language model merged with the TIES method from two fine-tuned checkpoints, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base model. It is intended for general language understanding and generation tasks.
Model Overview
This model, merge_lenfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_ties, is a 1.5-billion-parameter language model created by Zachary1150. It was built with the TIES merge method (TrIm, Elect Sign & Merge), which combines multiple fine-tuned models into a single model while reducing interference between their parameter updates. The base model for the merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
Merge Details
The merge combines two checkpoints, each contributing with a weight of 0.5 and a density of 0.5, i.e. an equal share from both sources. The merge was run with normalize: true and dtype: bfloat16. A sketch of the likely mergekit configuration follows.
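The card does not name the two source checkpoints, so the entries below use placeholder identifiers; the weight, density, normalize, and dtype values mirror the settings stated above. A plausible mergekit TIES configuration:

```yaml
# Hypothetical mergekit config; checkpoint-a / checkpoint-b are placeholders,
# not the actual source repositories.
models:
  - model: Zachary1150/checkpoint-a
    parameters:
      weight: 0.5
      density: 0.5
  - model: Zachary1150/checkpoint-b
    parameters:
      weight: 0.5
      density: 0.5
merge_method: ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
parameters:
  normalize: true
dtype: bfloat16
```

A config like this is typically applied with mergekit's `mergekit-yaml config.yaml ./output` command.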
Key Characteristics
- Architecture: Merged model based on DeepSeek-R1-Distill-Qwen-1.5B.
- Parameter Count: 1.5 billion parameters.
- Merge Method: TIES, which trims low-magnitude parameter changes, elects a majority sign per parameter, and merges only the agreeing updates, reducing interference between fine-tuned checkpoints (a toy sketch follows this list).
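To make the three TIES steps concrete, here is a toy, self-contained sketch of the method on a single weight tensor. It is illustrative only, not mergekit's actual implementation; the function name and the simplified magnitude-weighted sign election are assumptions for demonstration.

```python
import torch

def ties_merge(base, finetuned, density=0.5, weight=0.5):
    """Toy TIES merge of several fine-tuned tensors onto one base tensor."""
    # 1. Task vectors: each checkpoint's (scaled) delta from the base weights.
    deltas = [weight * (ft - base) for ft in finetuned]
    # 2. Trim: zero out all but the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().topk(k).values.min()
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))
    stacked = torch.stack(trimmed)
    # 3. Elect sign: per-entry majority sign, weighted by magnitude.
    elected = torch.sign(stacked.sum(dim=0))
    # 4. Disjoint merge: average only the entries that agree with the elected sign.
    agree = (torch.sign(stacked) == elected) & (stacked != 0)
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta

# Tiny usage example on random tensors.
base = torch.zeros(4, 4)
merged = ties_merge(base, [torch.randn(4, 4), torch.randn(4, 4)])
print(merged)
```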
Potential Use Cases
This model is suitable for general natural language processing tasks such as text generation and question answering, drawing on the combined strengths of its constituent checkpoints. At 1.5B parameters it is small enough for efficient deployment while retaining solid language capabilities.
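A minimal usage sketch with the Hugging Face transformers library, assuming the repository ships standard, transformers-compatible weights (the prompt and generation settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's dtype setting
    device_map="auto",
)

prompt = "Briefly explain what model merging is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```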