InfiR-1B-Base: A Reasoning-Optimized Small Language Model
InfiR-1B-Base, developed by InfiX, is a 1-billion-parameter base model continually pretrained from Meta's Llama-3.2-1B. It is designed to bring stronger reasoning capabilities to small model sizes, reducing adoption barriers and addressing privacy concerns. The architecture follows Llama-3.2-1B: 16 transformer layers, 32 attention heads with grouped-query attention (GQA), RoPE positional embeddings, and a 4K context window.
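For reference, here is a minimal loading and completion sketch using the standard transformers API. The Hugging Face repository id InfiX-ai/InfiR-1B-Base is an assumption; substitute the actual path if the model is published under a different name.

```python
# Minimal loading sketch (assumed repo id "InfiX-ai/InfiR-1B-Base"; adjust if needed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InfiX-ai/InfiR-1B-Base"  # assumption: the actual Hub path may differ
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # place weights on GPU if available (requires accelerate)
)

# Base models do plain text completion, so prompt with a prefix to continue.
prompt = "The derivative of x^2 with respect to x is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```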
Key Capabilities & Performance
InfiR-1B-Base demonstrates strong performance in reasoning, mathematics, and code generation, outperforming its base model, Llama-3.2-1B, across several benchmarks:
- Mathematical Reasoning: Achieves 63.46 on GSM8K and 31.82 on MATH, significantly higher than Llama-3.2-1B's 8.11 and 3.42, respectively.
- Code Generation: Scores 37.80 on HumanEval and 53.40 on MBPP, surpassing Llama-3.2-1B's 17.68 and 49.0.
- General Reasoning: Attains 47.24 on MMLU.
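Because this is a base model, benchmarks such as GSM8K are typically run with few-shot, completion-style prompts rather than chat templates. The sketch below illustrates that prompt format with a toy example (not an actual benchmark item); the repository id is again an assumption.

```python
# Few-shot, completion-style prompting in the spirit of GSM8K evaluation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InfiX-ai/InfiR-1B-Base"  # assumption: adjust to the actual Hub path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = (
    "Question: A classroom has 3 rows of 8 desks. How many desks are there?\n"
    "Answer: 3 * 8 = 24. The answer is 24.\n\n"
    "Question: Tom buys 4 packs of 6 apples and eats 5 of them. How many apples are left?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated continuation, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```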
Training Details
The model was pretrained on 900 billion tokens, comprising 52% code and 48% high-quality web data (math, science, and encyclopedic content). An additional 40 billion tokens were used for an annealing phase with extra math, code, and synthetic samples. Supervised fine-tuning (SFT), used to produce the instruction-tuned InfiR-1B-Instruct variant, drew on approximately 4 million samples from datasets such as Infinity-Instruct, Orca-AgentInstruct-1M, NuminaMath, and ScaleQuest.
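As a rough sanity check, the sketch below converts the reported mixture percentages into absolute token counts, using only the figures stated above.

```python
# Back-of-the-envelope token accounting for the reported training mixture.
PRETRAIN_TOKENS = 900e9   # 900B continual-pretraining tokens
CODE_FRACTION = 0.52      # 52% code
WEB_FRACTION = 0.48       # 48% high-quality web data (math, science, encyclopedic)
ANNEALING_TOKENS = 40e9   # additional 40B tokens for the annealing phase

code_tokens = PRETRAIN_TOKENS * CODE_FRACTION
web_tokens = PRETRAIN_TOKENS * WEB_FRACTION

print(f"code tokens:      {code_tokens / 1e9:.0f}B")        # 468B
print(f"web tokens:       {web_tokens / 1e9:.0f}B")         # 432B
print(f"annealing tokens: {ANNEALING_TOKENS / 1e9:.0f}B")   # 40B
print(f"total tokens:     {(PRETRAIN_TOKENS + ANNEALING_TOKENS) / 1e9:.0f}B")  # 940B
```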
Limitations
As a base model, InfiR-1B-Base primarily performs text completion and may not follow complex instructions as effectively as its instruction-tuned counterpart (InfiR-1B-Instruct). It still exhibits performance gaps compared to larger 70B+ models on very hard reasoning tasks (e.g., OlympiadBench). The model inherits the Llama-3.2 tokenizer and pre-training distribution, which may reflect web biases, and its knowledge cutoff is mid-2023. Evaluation has focused on English benchmarks, with multilingual robustness not yet verified.