Name: ank028/Llama-3.2-1B-Instruct-gsm8k-MGSM8K-sft1-slerp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ank028

Model Overview

ank028/Llama-3.2-1B-Instruct-gsm8k-MGSM8K-sft1-slerp is a 1 billion parameter instruction-tuned language model built on the Llama 3.2 architecture. It was developed by ank028 using the SLERP merge method to combine two existing Llama-3.2-1B-Instruct models.

Key Capabilities

Specialized Merge: Created by merging ank028/Llama-3.2-1B-Instruct-gsm8k and autoprogrammer/Llama-3.2-1B-Instruct-MGSM8K-sft1. This approach aims to leverage the individual strengths of each component model.
SLERP Method: Uses the Spherical Linear Interpolation (SLERP) merge method, which blends the two models' weights along an arc rather than a straight line, so the result tends to keep the direction and scale of the weights better than plain averaging.
Targeted Enhancement: The choice of source models suggests a focus on mathematical reasoning and instruction following, judging by the dataset names in their identifiers (gsm8k and MGSM8K).
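
To make the SLERP idea concrete, here is a minimal sketch of the interpolation applied to two small vectors. It is an illustration only, not the recipe used to build this model: the function name slerp, the eps threshold, and the fallback to linear interpolation are assumptions, and the per-layer interpolation factors used in the actual merge are not stated in this listing.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. The helper name and the eps
    threshold are illustrative assumptions, not part of any merge tool.
    """
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")

    def lerp():
        # Plain linear interpolation, used when SLERP is ill-defined.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel or antiparallel vectors: fall back to lerp.
        return lerp()

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a real merge the same formula would be applied tensor by tensor to the two checkpoints, typically with the tensors flattened to vectors first.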

Good For

Mathematical Reasoning: Potentially well-suited for tasks involving arithmetic, algebra, and other quantitative problem-solving, due to its lineage from models fine-tuned on mathematical datasets.
Instruction Following: Designed to respond effectively to instructions, making it useful for various NLP applications where precise command execution is required.
Resource-Constrained Environments: As a 1 billion parameter model, it offers a balance between capability and computational efficiency, making it suitable for deployment in environments with limited resources.
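
As a rough guide to what "limited resources" means here, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 1 billion parameters as stated above; the helper name weight_memory_gib and the dtype table are illustrative, and activations, KV cache, and runtime overhead are not included.

```python
# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Estimate weight storage in GiB for a given precision.

    Covers the weights only; activations and KV cache are extra.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Nominal parameter count taken from the description above (assumption:
# the exact count of the checkpoint may differ somewhat).
NOMINAL_PARAMS = 1_000_000_000

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(NOMINAL_PARAMS, dtype):.2f} GiB")
```

At 16-bit precision this comes to a little under 2 GiB for the weights alone.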
