Overview

OLMo-1B-0724-hf is a 1 billion parameter open language model from the Allen Institute for AI (AI2), built to support scientific research on language models. This July 2024 release updates the original OLMo 1B and improves its HellaSwag score by 4.4 points, alongside gains on other evaluations. The improvement is attributed to training on an enhanced version of the Dolma dataset (v1.7) and to a two-stage training curriculum.

Key Capabilities

Improved Performance: Shows notable gains on benchmarks like HellaSwag compared to its predecessor, with an average score of 65.0 across various tasks.
Transparent Development: Released with all code, checkpoints, logs, and training details to enable reproducibility and scientific study.
Staged Training: Benefits from a two-stage training process, initially on the full Dolma 1.7 dataset, followed by an annealing phase on a higher-quality subset.
Hugging Face Integration: Directly compatible with Hugging Face Transformers from v4.40 onwards, supporting easy inference and fine-tuning.
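
As a rough illustration of the Hugging Face integration described above, the sketch below checks the installed Transformers version against the v4.40 minimum and then runs sampled generation. The Hub id `allenai/OLMo-1B-0724-hf` is an assumption (the text gives only the model name), and the helper names `parse_major_minor`, `supports_olmo`, and `generate` are hypothetical, introduced only for this example. The sampling settings are arbitrary, not recommended values.

```python
from typing import Tuple

# Assumed Hub id: the text above names the model but not its full repository path.
MODEL_ID = "allenai/OLMo-1B-0724-hf"
# Minimum Transformers version stated in the text above.
MIN_TRANSFORMERS = (4, 40)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '4.41.2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supports_olmo(version: str) -> bool:
    """True if the given Transformers version is at least v4.40."""
    return parse_major_minor(version) >= MIN_TRANSFORMERS


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Load the model (downloaded on first use) and return a sampled continuation."""
    # Imported lazily so the version helpers work without the library installed.
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not supports_olmo(transformers.__version__):
        raise RuntimeError("OLMo support requires transformers >= 4.40")

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.95
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Language modeling is ")` downloads the weights on first use and returns the prompt followed by a sampled continuation, so the output differs from run to run.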

Good For

Language Model Research: Ideal for researchers studying language model behavior, training methodologies, and scaling laws due to its open and transparent nature.
Fine-tuning: Provides multiple intermediate checkpoints for flexible fine-tuning on specific downstream tasks.
English NLP Tasks: Optimized for general English natural language processing applications.
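
For the intermediate checkpoints mentioned above, one plausible way to pick a starting point for fine-tuning is the `revision` argument of `from_pretrained`. This is a sketch under assumptions: it presumes the checkpoints are published as branches of the Hub repository, the Hub id `allenai/OLMo-1B-0724-hf` is assumed, and the helpers `checkpoint_kwargs`, `list_revisions`, and `load_checkpoint` are hypothetical names for this example. No real revision names are given here, so they are discovered at run time.

```python
from typing import Any, Dict, List, Optional

# Assumed Hub id: the text above names the model but not its full repository path.
MODEL_ID = "allenai/OLMo-1B-0724-hf"


def checkpoint_kwargs(revision: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for from_pretrained; no revision means the default branch."""
    kwargs: Dict[str, Any] = {}
    if revision is not None:
        kwargs["revision"] = revision
    return kwargs


def list_revisions() -> List[str]:
    """List branch names of the repository (assumes checkpoints are stored as branches)."""
    from huggingface_hub import list_repo_refs  # lazy import; needs network access

    return [branch.name for branch in list_repo_refs(MODEL_ID).branches]


def load_checkpoint(revision: Optional[str] = None):
    """Load the final model, or an intermediate checkpoint when a revision is given."""
    from transformers import AutoModelForCausalLM  # lazy import; needs transformers >= 4.40

    return AutoModelForCausalLM.from_pretrained(MODEL_ID, **checkpoint_kwargs(revision))
```

A typical flow would call `list_revisions()`, choose one of the returned names, and pass it to `load_checkpoint` before handing the model to a fine-tuning loop.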
