Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Dec 25, 2025 · Architecture: Transformer

Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties is a 1.5 billion parameter language model created by Zachary1150 by merging pre-trained models with the TIES method. It is based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B and supports a context length of 131072 tokens. The model targets applications that benefit from merged architectures, combining the strengths of its constituent models.
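
A minimal usage sketch with the Hugging Face transformers library follows; the model ID comes from this card, while the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: load the merged model for text generation. The model ID is
# taken from this card; prompt and generation settings are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, as listed in the metadata above
    device_map="auto",
)

prompt = "Summarize the idea behind TIES model merging."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```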


Model Overview

Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties is a 1.5 billion parameter language model developed by Zachary1150. It was constructed with the TIES merge method (TrIm, Elect Sign & Merge), which combines multiple fine-tuned models into a single, more capable model by trimming low-magnitude parameter changes, electing a majority sign per parameter, and merging only the changes that agree with that sign. The base model for the merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, giving it a foundation in the Qwen architecture.
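
As a concrete illustration, the sketch below reconstructs the TIES merge step (trim, elect sign, disjoint merge) for a single parameter tensor in PyTorch. It is a simplified reading of the published algorithm, not the exact implementation used to build this model; the default density and weight of 0.5 match the configuration described in the next section.

```python
import torch

def ties_merge(base: torch.Tensor, tuned: list[torch.Tensor],
               density: float = 0.5, weight: float = 0.5) -> torch.Tensor:
    """Merge fine-tuned tensors into `base` following a simplified TIES recipe."""
    # Task vectors: each fine-tuned model's delta from the shared base.
    deltas = [t - base for t in tuned]

    # Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        cutoff = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= cutoff, d, torch.zeros_like(d)))

    stacked = torch.stack(trimmed)

    # Elect sign: per-parameter majority sign across the trimmed deltas.
    elected = torch.sign(stacked.sum(dim=0))

    # Disjoint merge: average only the entries that agree with the elected sign.
    agree = (torch.sign(stacked) == elected) & (stacked != 0)
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)

    return base + weight * merged_delta

# Toy check with two "actor" deltas, mirroring the 0.5/0.5 setup.
base = torch.randn(4, 4)
out = ties_merge(base, [base + torch.randn(4, 4), base + torch.randn(4, 4)])
print(out.shape)  # torch.Size([4, 4])
```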

Merge Details

This model integrates two actor models, each entering the merge with a weight of 0.5 and a density of 0.5, as specified in the TIES configuration. The merge consolidates the capabilities of the individual components, potentially improving overall performance or specializing in certain tasks. The model supports a context window of 131072 tokens, making it suitable for processing extensive inputs or generating long-form content.
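
For reference, the dictionary below mirrors how such a merge is typically declared in a mergekit-style TIES configuration. The two actor model names are hypothetical placeholders, since the card does not identify them; only the method, base model, weight, density, and dtype come from this page.

```python
# Hypothetical mergekit-style declaration of this merge. Actor model names
# are placeholders; real mergekit configs use the same structure in YAML.
merge_config = {
    "merge_method": "ties",
    "base_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    "models": [
        {"model": "<actor-model-1>",  # placeholder: not disclosed on the card
         "parameters": {"weight": 0.5, "density": 0.5}},
        {"model": "<actor-model-2>",  # placeholder: not disclosed on the card
         "parameters": {"weight": 0.5, "density": 0.5}},
    ],
    "dtype": "bfloat16",  # matches the BF16 quantization in the metadata
}
```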

Potential Use Cases

  • Applications requiring a model with a very long context window.
  • Scenarios where combining the strengths of multiple specialized models is beneficial.
  • Research into model merging techniques and their practical applications.