SII-GAIR-NLP/davinci-llm-model
Text Generation
- Concurrency Cost: 1
- Model Size: 3.1B
- Quant: BF16
- Ctx Length: 32k
- Published: Mar 26, 2026
- Architecture: Transformer
- Warm: 0.0K

daVinci-LLM-3B, developed by SII-GAIR-NLP, is a 3.09-billion-parameter decoder-only Transformer model from the Qwen2 family with a 4096-token context length. It is designed for transparent and reproducible pretraining science, offering extensive ablation studies and detailed documentation of its training trajectory. This base model performs well in broad language understanding, math/science reasoning, and code generation, achieving results comparable to larger 7B-scale models.


Popular Sampler Settings

The three parameter combinations most often used by Featherless users for this model.

- temperature — scales logits before softmax; lower values make sampling more deterministic
- top_p — nucleus sampling: keeps the smallest set of tokens whose cumulative probability reaches p
- top_k — keeps only the k most probable tokens
- frequency_penalty — penalizes tokens in proportion to how often they have already appeared
- presence_penalty — penalizes any token that has appeared at least once
- repetition_penalty — scales down the logits of previously generated tokens
- min_p — drops tokens whose probability falls below min_p times the top token's probability
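To make the filtering parameters concrete, here is a minimal sketch (not Featherless's implementation) of how temperature, top_k, top_p, and min_p are commonly applied to a logit vector before sampling; the frequency/presence/repetition penalties are omitted since they depend on the generation history:

```python
import numpy as np

def sample_filter(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return a filtered, renormalized probability distribution over tokens."""
    # Temperature scales the logits before softmax.
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # top_k: keep only the k highest-probability tokens (0 disables).
    if top_k > 0:
        kth = np.sort(probs)[-top_k]
        probs = np.where(probs >= kth, probs, 0.0)

    # min_p: drop tokens below min_p times the top token's probability.
    if min_p > 0.0:
        probs = np.where(probs >= min_p * probs.max(), probs, 0.0)

    # top_p (nucleus): keep the smallest set of tokens whose
    # cumulative probability reaches top_p.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        cutoff = np.searchsorted(cum, top_p) + 1
        mask = np.zeros_like(probs)
        mask[order[:cutoff]] = 1.0
        probs = probs * mask

    return probs / probs.sum()
```

For example, with `top_k=2` only the two most likely tokens survive, and with a small `top_p` the distribution can collapse to a single token.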