Model Overview
allenai/llama-3-tulu-2-70b is a 70-billion-parameter language model developed by the Allen Institute for AI (AllenAI) and fine-tuned from the meta-llama/Meta-Llama-3-70B base model. It is designed to act as a helpful assistant and was trained on a mix of publicly available, synthetic, and human-created datasets. The training methodology is detailed in the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" [https://arxiv.org/abs/2311.10702].
Key Capabilities and Performance
The model is evaluated on benchmarks covering instruction following, reasoning, code generation, and truthfulness:
- General Instruction Following: Averages 73.01 across its evaluation benchmarks. This figure is on a 0-100 scale, whereas the per-benchmark scores below are reported as fractions of 1.
- Reasoning: Scores 0.752 on MMLU (5-shot) and 0.845 on GSM8K (8-shot, chain-of-thought).
- Code Generation: Attains 0.861 on Codex HumanEval Pass@10, highlighting its proficiency in coding tasks.
- Truthfulness: Scores 0.646 on TruthfulQA %Info+True, indicating a good balance of informativeness and truthfulness.
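For readers unfamiliar with the Pass@10 metric in the list above, the following is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval. The source does not say how many samples were drawn per problem or which estimator produced the reported 0.861, so the function name pass_at_k and the example values are illustrative assumptions, not a reproduction of the evaluation.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (hypothetical helper).

    n: total samples generated for the problem (assumed n >= k)
    c: number of those samples that pass the unit tests
    k: sample budget being evaluated, e.g. 10 for Pass@10
    """
    if n - c < k:
        # Fewer than k failing samples exist, so any draw of k
        # samples must contain at least one passing sample.
        return 1.0
    # 1 minus the probability that all k drawn samples fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative values only: 20 samples, 3 of which pass.
score = pass_at_k(n=20, c=3, k=10)
```

A benchmark-level score would then be the mean of this per-problem estimate over all problems.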
Input Format
The model is trained to use a specific input format for optimal generation quality:
<|user|>
Your message here!
<|assistant|>
Always include a newline after <|assistant|>; omitting it can degrade generation quality.
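The format above can be assembled with a small helper. This is a minimal sketch for a single user turn only; the function name build_prompt is hypothetical, and stripping surrounding whitespace from the message is a choice made here, not something the model card specifies.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the <|user|> / <|assistant|> format.

    The returned string ends with a newline after <|assistant|>,
    as the model card recommends.
    """
    # Assumption: trim stray whitespace so the message sits on its own
    # line(s) between the two role markers.
    message = user_message.strip()
    return f"<|user|>\n{message}\n<|assistant|>\n"


prompt = build_prompt("Your message here!")
print(prompt, end="")
```

The resulting string can be passed to whatever tokenizer and generation code is in use; the model's completion is expected to follow the final newline.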
Intended Uses
This model is primarily intended for use as a helpful assistant for conversational and instruction-following tasks. Because it was fine-tuned on a diverse data mix, it is suited to general-purpose applications that require language understanding and generation.