Name: wvnvwn/llama-2-13b-chat-hf-lr5e-5-resta-0.3 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wvnvwn

Model Overview

The wvnvwn/llama-2-13b-chat-hf-lr5e-5-resta-0.3 is a 13 billion parameter language model derived from the Llama-2 architecture. It was created by wvnvwn through a linear merge of three distinct Llama-2-13b-chat-hf variants using the MergeKit tool.

Key Merge Details

This model is a composite of:

The foundational meta-llama/Llama-2-13b-chat-hf model.
wvnvwn/llama-2-13b-chat-hf-SSFT-lr5e-5, a version likely fine-tuned for specific supervised instruction following.
wvnvwn/llama-2-13b-chat-hf-lr5e-5-gsm8k-lr5e-5, a variant specifically fine-tuned for the GSM8K dataset, indicating an emphasis on mathematical reasoning capabilities.

Configuration

The merge utilized a float16 dtype and applied specific weights to each component model across all 40 layers. Notably, the base Llama-2-13b-chat-hf model was included with a negative weight, suggesting an attempt to subtract or de-emphasize certain characteristics of the base model while integrating the fine-tuned components.

Potential Use Cases

This merged model is potentially well-suited for applications requiring:

Enhanced chat capabilities due to its Llama-2-chat base.
Improved mathematical reasoning and problem-solving, benefiting from the GSM8K fine-tuning.
General-purpose instruction following, leveraging the SSFT-tuned component.

Overview

Model Overview

Key Merge Details

Configuration

Potential Use Cases

Full Model Card (README)