SII-GAIR-NLP/davinci-llm-model

Text generation · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Mar 26, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The SII-GAIR-NLP/davinci-llm-model is a 3.09 billion parameter decoder-only Transformer from the Qwen2 family, developed by SII-GAIR-NLP. It features a 4096-token context length and is designed for transparent, reproducible pretraining research: its data processing decisions, training trajectories, and over 200 ablation studies are publicly documented. As a base model it performs strongly in general language understanding, math and science reasoning, and code generation, achieving an overall average score of 51.72 across 19 benchmarks, comparable to larger 7B-scale models.


daVinci-LLM-3B: A Transparent Pretraining Research Model

daVinci-LLM-3B is a 3.09 billion parameter base language model developed by SII-GAIR-NLP, designed to advance the science of pretraining. Unlike many LLMs, this project emphasizes full transparency and reproducibility, releasing not only the final model weights but also detailed training trajectories, intermediate checkpoints, data processing decisions, and over 200 ablation studies. This allows researchers to deeply investigate data quality, mixture design, training dynamics, and evaluation validity.
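
A minimal usage sketch, assuming the weights load through the standard Hugging Face transformers API under the repo id shown above; the dtype and generation settings are illustrative choices, not values prescribed by this card:

```python
# Minimal sketch: load the base model from the Hugging Face Hub and run a
# plain text completion. The dtype and generation settings below are
# illustrative assumptions, not values prescribed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SII-GAIR-NLP/davinci-llm-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Base model: use a plain completion prompt rather than a chat template.
prompt = "The main obstacles to reproducible LLM pretraining research are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```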

Key Capabilities

  • Transparent Pretraining: All aspects of the pretraining pipeline, including data processing logic, mixtures, logs, and checkpoints, are publicly documented (see the checkpoint-comparison sketch after this list).
  • Data Darwinism Framework: Utilizes a systematic L0–L9 taxonomy for categorizing data processing depth.
  • Extensive Ablations: Includes over 200 controlled experiments, providing insights into both positive and negative training outcomes.
  • Strong General Performance: Achieves an overall average score of 51.72 across 19 benchmarks, matching or exceeding the performance of larger 7B-scale models like OLMo-3 7B.
  • Specialized Reasoning: Demonstrates strong performance in math (62.80) and code generation (55.99), surpassing comparable models.
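
Because intermediate checkpoints and training trajectories are released alongside the final weights, training dynamics can be probed directly. The sketch below compares per-token loss between the final weights and an earlier checkpoint, assuming checkpoints are published as Hub revisions; the revision name is a hypothetical placeholder, so consult the repository for the actual naming scheme.

```python
# Sketch: probe training dynamics by comparing per-token loss between the
# final weights and an earlier checkpoint. Assumes intermediate checkpoints
# are published as Hub revisions; "step-100000" is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SII-GAIR-NLP/davinci-llm-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)

probe = "Water boils at 100 degrees Celsius at standard atmospheric pressure."
inputs = tokenizer(probe, return_tensors="pt")

for revision in ["main", "step-100000"]:  # check the repo for real names
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    model.eval()
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    print(f"{revision}: per-token loss = {loss.item():.3f}")
```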

Good For

  • Research in Pretraining Science: Ideal for studying data quality, training dynamics, and evaluation stability.
  • General Language Understanding: Capable of broad language tasks.
  • Math and Science Reasoning: Excels in complex mathematical and scientific problem-solving.
  • Code Generation: Strong performance in generating code across various languages.

This model is a base model and requires additional instruction-tuning and safety alignment for production deployment.
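
Since instruction tuning is absent, completion-style or few-shot prompting is usually the most effective way to query the base model directly. A short sketch, with illustrative prompt content:

```python
# Sketch: few-shot, completion-style prompting for the untuned base model.
# The prompt content is illustrative; no chat template is applied because
# the card describes this as a base model without instruction tuning.
from transformers import pipeline

generator = pipeline("text-generation", model="SII-GAIR-NLP/davinci-llm-model")

few_shot_prompt = (
    "Q: What is 12 * 7?\nA: 84\n"
    "Q: What is 15 * 6?\nA: 90\n"
    "Q: What is 23 * 4?\nA:"
)
result = generator(few_shot_prompt, max_new_tokens=8, do_sample=False)
print(result[0]["generated_text"])
```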