Name: Zachary1150/merge_linear_len0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Zachary1150

Model Overview

This model, Zachary1150/merge_linear_len0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6, is a 1.5-billion-parameter language model developed by Zachary1150. It was built with mergekit's Linear merge method, which combines two pre-trained language models into one.

Merge Details

The model integrates two base models, each contributing to specific characteristics:

One base model, weighted at 0.1, appears to focus on "length" (len_MRL4096_ROLLOUT4_LR1e-6).
The second base model, weighted at 0.9, appears to emphasize "format" (fmt_MRL4096_ROLLOUT4).

This weighting suggests that the "format" characteristics are prioritized, with a smaller contribution from the "length"-focused model. The merge used the bfloat16 data type with weight normalization enabled.
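As a rough illustration of what a linear merge computes, the sketch below takes a weighted average of same-named parameters from two toy "models". It is not mergekit's implementation: the function name linear_merge, the toy parameter name layer.weight, and the plain-list tensors are all assumptions made for the example. With normalization, the weights are divided by their sum, so 0.1 and 0.9 (which already sum to 1.0) are left unchanged.

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of same-named parameters (hypothetical helper)."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        # Rescale so the weights sum to 1.
        weights = [w / total for w in weights]
    merged = {}
    for name in state_dicts[0]:
        # Walk the parameter element by element across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [
            sum(w * v for w, v in zip(weights, column)) for column in columns
        ]
    return merged


# Toy stand-ins for the two base models (values are made up).
len_model = {"layer.weight": [1.0, 2.0, 3.0]}
fmt_model = {"layer.weight": [3.0, 2.0, 1.0]}

# 0.1 for the "length" model, 0.9 for the "format" model, as described above.
merged = linear_merge([len_model, fmt_model], [0.1, 0.9])
print(merged)
```

Each merged value sits much closer to the "format" model's value than to the "length" model's, which is the practical effect of the 0.9 / 0.1 split.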

Key Characteristics

Architecture: Merged model based on pre-trained language models.
Parameter Count: 1.5 billion parameters.
Context Length: 131,072 tokens (128K).
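The context window caps the prompt and the generated tokens together. A minimal sketch of that budget arithmetic follows, assuming the 131,072-token figure listed above; the helper name max_new_tokens is hypothetical, and prompt token counts would come from the model's tokenizer in practice.

```python
CONTEXT_WINDOW = 131_072  # tokens, as listed under Key Characteristics


def max_new_tokens(prompt_tokens, context_window=CONTEXT_WINDOW):
    """Tokens left for generation after the prompt (hypothetical helper)."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for output.
    return max(context_window - prompt_tokens, 0)


print(max_new_tokens(100_000))
```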

Potential Use Cases

Given its merged nature and substantial context length, this model could be particularly effective for applications requiring:

Processing and generating long-form content while adhering to specific formatting requirements.
Tasks where understanding and maintaining context over extended text sequences is crucial.
Experiments in model merging and exploring the effects of weighted combinations of specialized base models.
