Name: allenai/Llama-3.1-Tulu-3-70B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: allenai

Overview

Llama-3.1-Tulu-3-70B is a 70 billion parameter instruction-following model from AllenAI, built upon Meta's Llama 3.1 base model. It is part of the Tülu 3 family, which emphasizes fully open-source data, code, and training recipes. The model is primarily English-language and is licensed under the Llama 3.1 Community License Agreement.

Key Capabilities

Instruction Following: Designed for state-of-the-art performance across a diversity of tasks, including general chat.
Mathematical Reasoning: Shows strong performance on benchmarks like MATH and GSM8K.
Instruction Following Evaluation (IFEval): Excels in complex instruction following scenarios.
Open-Source Approach: Provides a comprehensive post-training package with open-source data, code, and recipes.

Performance Highlights

On a range of benchmarks, the Tülu 3 70B model achieves an average score of 76.0, outperforming Llama 3.1 70B Instruct (73.4) and Qwen 2.5 72B Instruct (71.5). Notable scores include:

PopQA (15 shot): 46.5
BigBenchHard (3 shot, CoT): 82.0
MATH (4 shot CoT, Flex): 63.0
GSM8K (8 shot, CoT): 93.5
Safety (6 task avg.): 88.3

Usage Considerations

The model has limited safety training and does not include in-the-loop filtering, meaning it can produce problematic outputs. It is intended for research and educational use, and its fine-tuning involved datasets with outputs from third-party models, subject to their respective terms of use.