Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear is a 1.5 billion parameter language model created by Zachary1150 using a linear merge. It combines two actor checkpoints from baselines_openrs with a 0.7/0.3 weighting and retains a substantial 131,072-token context length, making it suitable for applications that need a compact yet capable merged model.
Model Overview
This model, merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the MergeKit tool, specifically employing the Linear merge method to combine two distinct pre-trained language models. The merging process involved actor checkpoints from baselines_openrs/acc_MRL4096_ROLLOUT4_LR5e-7 and baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7, with a weighted average applied (0.7 for the first model, 0.3 for the second).
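Conceptually, a linear merge with normalization computes a weighted average of each pair of corresponding parameter tensors across the input checkpoints. The sketch below illustrates the arithmetic on toy arrays; it is not mergekit's actual implementation, and the function name `linear_merge` is illustrative:

```python
import numpy as np

def linear_merge(tensors, weights, normalize=True):
    """Weighted average of corresponding parameter tensors (illustrative sketch)."""
    w = np.asarray(weights, dtype=np.float64)
    if normalize:
        # With normalization enabled, weights are rescaled to sum to 1,
        # so 0.7 / 0.3 stays 0.7 / 0.3 here but e.g. 7 / 3 would too.
        w = w / w.sum()
    return sum(wi * t for wi, t in zip(w, tensors))

# Toy stand-ins for one parameter tensor from each checkpoint
a = np.array([1.0, 2.0])  # first model, weight 0.7
b = np.array([3.0, 6.0])  # second model, weight 0.3
merged = linear_merge([a, b], [0.7, 0.3])
```

In a real merge this averaging is applied tensor-by-tensor over the full state dicts of both checkpoints.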
Key Characteristics
- Architecture: Merged model using the Linear method.
- Parameter Count: 1.5 billion parameters.
- Context Length: Features a substantial 131,072 token context window.
- Merging Configuration: Uses `bfloat16` dtype and applies normalization during the merge.
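The characteristics above suggest a mergekit configuration along these lines (a reconstruction from the description, not the exact config file used; field layout follows mergekit's linear-merge schema):

```yaml
# Sketch of the described linear merge (reconstructed, not the original config)
models:
  - model: baselines_openrs/acc_MRL4096_ROLLOUT4_LR5e-7
    parameters:
      weight: 0.7
  - model: baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7
    parameters:
      weight: 0.3
merge_method: linear
dtype: bfloat16
parameters:
  normalize: true
```

With `normalize: true`, mergekit rescales the weights to sum to 1 before averaging, so the 0.7/0.3 split is applied as-is.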
Potential Use Cases
This model is suitable for developers and researchers looking for:
- A compact language model derived from a weighted merge of specialized actor checkpoints.
- Applications benefiting from a very large context window (131k tokens).
- Experimentation with merged model architectures for specific tasks where the constituent models excel.