SII-GAIR-NLP/davinci-llm-model
Text Generation
Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Mar 26, 2026 · Architecture: Transformer
daVinci-LLM-3B, developed by SII-GAIR-NLP, is a 3.09-billion-parameter decoder-only Transformer model from the Qwen2 family with a 4096-token context length. It is designed for transparent and reproducible pretraining science, offering extensive ablation studies and detailed documentation of its training trajectory. This base model excels in broad language understanding, math/science reasoning, and code generation, achieving performance comparable to larger 7B-scale models.
Popular Sampler Settings
Top parameter combinations used by Featherless users for this model (values not captured):
temperature: –
top_p: –
top_k: –
frequency_penalty: –
presence_penalty: –
repetition_penalty: –
min_p: –
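The sampler parameters listed above map directly onto the fields of an OpenAI-style completion request, which is how hosted models like this one are typically queried. Below is a minimal sketch of assembling such a payload; the model id is taken from this page's title, and all numeric values are illustrative placeholders, not the (unpopulated) Featherless user statistics above.

```python
# Sketch: building an OpenAI-style completion payload that exposes the
# sampler parameters named on this page. Values are illustrative defaults,
# not measured user settings.
import json


def build_request(prompt: str, **overrides) -> dict:
    """Assemble a completion payload with explicit sampler settings."""
    payload = {
        "model": "SII-GAIR-NLP/davinci-llm-model",  # assumed model id (from page title)
        "prompt": prompt,
        "temperature": 0.7,         # randomness of sampling
        "top_p": 0.9,               # nucleus sampling cutoff
        "top_k": 40,                # restrict to the k most likely tokens
        "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
        "presence_penalty": 0.0,    # penalize tokens that appeared at all
        "repetition_penalty": 1.1,  # multiplicative repeat penalty
        "min_p": 0.05,              # drop tokens below this relative probability
    }
    payload.update(overrides)  # caller-supplied values win
    return payload


req = build_request("Explain beam search in one sentence.", temperature=0.2)
print(json.dumps(req, indent=2))
```

Keeping the sampler knobs in one dict makes it easy to store and swap whole configurations, which is what a "popular parameter combinations" table like the one above represents.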