Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.3_linear is a 1.5-billion-parameter language model created by Zachary1150 via a linear merge of two pre-trained base models. Built with the Mergekit framework to combine specific checkpoints, the model supports a substantial 131072-token context length and is intended for general language understanding and generation tasks, inheriting capabilities from its merged components.
Model Overview
This model, merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.3_linear, is a 1.5-billion-parameter language model developed by Zachary1150. It was constructed with the Mergekit framework, using the linear merge method to combine the strengths of two distinct pre-trained base models. The model features a large context window of 131072 tokens, making it suitable for tasks requiring extensive contextual understanding.
Merge Details
The merge combined checkpoints from two base models with fixed weights: 0.3 for one model and 0.7 for the other. Weight normalization was enabled and the merge was computed in bfloat16 precision. This approach aims to synthesize the capabilities of the constituent models into a single unified set of parameters.
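A Mergekit configuration matching these details might look like the following sketch. The base-model names are placeholders, since the source does not identify the actual checkpoints:

```yaml
# Hypothetical Mergekit config; model paths are placeholders.
merge_method: linear
models:
  - model: base-model-A        # placeholder for the first checkpoint
    parameters:
      weight: 0.3
  - model: base-model-B        # placeholder for the second checkpoint
    parameters:
      weight: 0.7
parameters:
  normalize: true
dtype: bfloat16
```

With `normalize: true`, Mergekit rescales the weights to sum to 1 (here 0.3 and 0.7 already do), so the merged parameters are a convex combination of the two checkpoints.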
Potential Use Cases
- Long-context applications: The 131072 token context length makes it highly suitable for processing and generating very long documents, code, or conversations.
- General language tasks: As a merged model, it is expected to perform well across a range of natural language understanding and generation tasks, benefiting from the diverse training of its base components.
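Conceptually, the linear merge behind this model is just a weighted average of corresponding parameter tensors. The sketch below illustrates the arithmetic with plain Python lists standing in for weight tensors; it is illustrative only, not the actual Mergekit implementation:

```python
# Linear (weighted-average) merge of two parameter vectors,
# mirroring a Mergekit "linear" merge with normalize enabled.
# Plain lists stand in for model weight tensors.

def linear_merge(params_a, params_b, w_a=0.3, w_b=0.7, normalize=True):
    """Return the element-wise weighted average w_a*a + w_b*b."""
    if normalize:
        # Rescale weights so they sum to 1, as Mergekit's
        # normalize option does.
        total = w_a + w_b
        w_a, w_b = w_a / total, w_b / total
    return [w_a * a + w_b * b for a, b in zip(params_a, params_b)]

merged = linear_merge([1.0, 2.0], [3.0, 4.0])
print(merged)  # each element is 0.3*a + 0.7*b
```

In the real merge this averaging is applied tensor-by-tensor across every layer of the two checkpoints, in bfloat16.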