grimjim/SauerHuatuoSkywork-o1-Llama-3.1-8B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 27, 2025 · License: llama3.1 · Architecture: Transformer

SauerHuatuoSkywork-o1-Llama-3.1-8B by grimjim is an 8-billion-parameter language model based on the Llama 3.1 architecture, with a 32,768-token context length. It is a merge of HuatuoSkywork-o1-Llama-3.1-8B and Llama-3.1-SauerkrautLM-8b-Instruct, designed to hybridize high-scoring Llama 3.1 instruct performance with o1-style reasoning capabilities. By integrating the reasoning strengths of its components, the merge aims to lift benchmark scores broadly, particularly on benchmarks other than IFEval.


Model Overview

The grimjim/SauerHuatuoSkywork-o1-Llama-3.1-8B is an 8 billion parameter language model built upon the Llama 3.1 architecture, featuring a 32768 token context length. It was created using the mergekit tool, specifically employing the SLERP merge method to combine two distinct Llama 3.1 8B models: grimjim/HuatuoSkywork-o1-Llama-3.1-8B and VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct.
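A mergekit SLERP configuration for this pairing would look something like the sketch below. The `layer_range`, `base_model` choice, interpolation weight `t`, and `dtype` are illustrative assumptions; the model card excerpt does not reproduce the actual config.

```yaml
# Hypothetical mergekit config: SLERP merge of the two parent models.
# layer_range, t, and dtype below are assumptions for illustration.
slices:
  - sources:
      - model: grimjim/HuatuoSkywork-o1-Llama-3.1-8B
        layer_range: [0, 32]
      - model: VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct
        layer_range: [0, 32]
merge_method: slerp
base_model: grimjim/HuatuoSkywork-o1-Llama-3.1-8B
parameters:
  t: 0.5  # uniform interpolation weight (assumed)
dtype: bfloat16
```

A config like this is typically run with `mergekit-yaml config.yaml ./output-model`.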

Key Characteristics

  • Hybridized Reasoning: This model is an experimental merge designed to integrate the reasoning capabilities of the "o1" model with the strong performance of the "SauerkrautLM" model.
  • Benchmark Improvements: Although IFEval scores were lower than those of the SauerkrautLM component, the merge improved most other benchmark results, indicating a broader enhancement in capabilities.
  • Llama 3.1 Base: Leverages the foundational strengths of the Llama 3.1 series.
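The SLERP method used for this merge blends each pair of weight tensors along the arc between them rather than along a straight line, which preserves the geometry of the parent weights better than plain averaging. A minimal sketch of the per-tensor operation (simplified: real merges handle per-layer `t` schedules and other edge cases):

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Spherical linear interpolation between flattened weight tensors a and b."""
    a_n = a / np.linalg.norm(a)
    b_n = b / np.linalg.norm(b)
    # Angle between the two weight directions.
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * a + t * b  # nearly parallel: fall back to LERP
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b

# t=0 returns the first parent, t=1 the second; t=0.5 lies on the arc between them.
w = slerp(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5)
```

For orthogonal unit vectors, the midpoint stays on the unit sphere, which is the property that distinguishes SLERP from linear averaging (whose midpoint would have norm ≈ 0.707 in this case).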

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, the model achieved an Average score of 26.63%. Specific benchmark results include:

  • IFEval (0-shot): 52.19%
  • BBH (3-shot): 32.09%
  • MMLU-PRO (5-shot): 33.23%

Use Cases

This model is suitable for applications requiring a balance of general language understanding and improved reasoning, particularly where the combined strengths of its merged components are beneficial. Its 32K context window supports processing longer inputs.
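As a rough illustration of working within the 32,768-token limit, the hypothetical helper below splits an over-long token sequence into overlapping windows; the function name and the 256-token overlap are assumptions for illustration, not part of the model card.

```python
CTX_LEN = 32768  # the model's context length

def chunk_tokens(tokens: list[int], max_len: int = CTX_LEN,
                 overlap: int = 256) -> list[list[int]]:
    """Split a token sequence into windows of at most max_len tokens,
    overlapping by `overlap` tokens so context carries across chunks."""
    if len(tokens) <= max_len:
        return [tokens]
    step = max_len - overlap
    return [tokens[i:i + max_len] for i in range(0, len(tokens) - overlap, step)]
```

Inputs that already fit return unchanged as a single chunk; longer inputs are windowed so every token appears in at least one chunk.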