Name: martyn/llama2-megamerge-dare-13b-v1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: martyn

Model Overview

martyn/llama2-megamerge-dare-13b-v1 is a 13 billion parameter language model built upon the Llama 2 architecture. This model is a "megamerge" of nine distinct 13B models, including specialized variants for code, mathematics, and general instruction following. The merge was performed using specific hyperparameters (p=0.1 and lambda=2) via the safetensors-merge-supermario tool.

Key Capabilities

This merged model integrates the strengths of its constituent parts, which include:

Code Generation: Incorporates capabilities from ajibawa-2023/Code-13B and ajibawa-2023/Python-Code-13B.
Mathematical Reasoning: Benefits from the meta-math/MetaMath-13B-V1.0 component.
Instruction Following & Chat: Leverages models like migtissera/Synthia-13B, FPHam/Sydney_Overthinker_13b_HF, allenai/tulu-2-dpo-13b, Doctor-Shotgun/cat-v1.0-13b, and NeverSleep/Noromaid-13b-v0.1.1 for enhanced conversational and instruction-based performance.

Good For

This model is suitable for use cases that require a combination of:

Multi-domain problem-solving: Tasks that span coding, mathematical logic, and general language understanding.
Versatile AI applications: Where a single model needs to handle diverse types of prompts and instructions effectively.
Exploration of merged model performance: For developers interested in the synergistic effects of combining multiple specialized models.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)