Zachary1150/merge_linear_cos0.9fmt0.1_MRL4096_ROLLOUT4_LR1e-6 is a 1.5-billion-parameter language model created by Zachary1150 using the linear merge method. It combines two base models, 'cos_MRL4096_ROLLOUT4_LR1e-6' and 'fmt_MRL4096_ROLLOUT4', with weights of 0.9 and 0.1 respectively, and is intended for general language understanding and generation tasks, leveraging the combined strengths of its constituent models.
Model Overview
This model, Zachary1150/merge_linear_cos0.9fmt0.1_MRL4096_ROLLOUT4_LR1e-6, is a 1.5-billion-parameter language model developed by Zachary1150. It was constructed with the linear merge method via the mergekit tool, combining two distinct pre-trained language models.
Merge Details
The merge process involved two base models:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface
These models were combined with specific weighting: the 'cos' model received a weight of 0.9 and the 'fmt' model a weight of 0.1. The merge was configured to normalize the weights and to use bfloat16 as its dtype.
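The configuration described above would correspond to a mergekit file along these lines. This is a sketch reconstructed from the card's stated settings; the actual configuration file is not included here:

```yaml
# Hypothetical mergekit config matching the card's description.
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
    parameters:
      weight: 0.9
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface
    parameters:
      weight: 0.1
merge_method: linear
normalize: true
dtype: bfloat16
```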
Key Characteristics
As a product of a linear merge, this model aims to synthesize the capabilities of its parent models. The specific nature of the 'cos' and 'fmt' models (indicated by their paths) suggests an origin from research or experimental checkpoints, likely focusing on specific aspects of language modeling or reinforcement learning from human feedback (RLHF) given the 'actor' designation. Its 1.5B parameter count makes it suitable for applications requiring a balance between performance and computational efficiency.
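To make the linear merge concrete: each parameter tensor of the merged model is a weighted average of the corresponding tensors in the parents. The sketch below illustrates this with plain Python lists standing in for tensors; the function name and toy data are illustrative, not part of mergekit:

```python
# Minimal sketch of a linear merge over same-shaped parameter dicts.
# Weights mirror the card (0.9 for 'cos', 0.1 for 'fmt') and are
# normalized to sum to 1, matching the merge's normalize setting.

def linear_merge(models, weights, normalize=True):
    """Return a weighted average of parameter dicts (lists stand in for tensors)."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name in models[0]:
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(weights, models))
            for i in range(len(models[0][name]))
        ]
    return merged

# Toy two-element "tensors" for illustration only.
cos = {"layer.weight": [1.0, 2.0]}
fmt = {"layer.weight": [3.0, 4.0]}
merged = linear_merge([cos, fmt], [0.9, 0.1])
# 0.9*1.0 + 0.1*3.0 = 1.2 and 0.9*2.0 + 0.1*4.0 = 2.2
```

With a 0.9/0.1 split, the merged parameters stay close to the 'cos' model while absorbing a small contribution from 'fmt'.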
Potential Use Cases
Given its merged nature, this model could be explored for general text generation, summarization, or question-answering tasks where the combined strengths of its base models are beneficial. Its relatively compact size allows for deployment in environments with moderate computational resources.