mihai-777/evolai-tfm-1p5b-v5
mihai-777/evolai-tfm-1p5b-v5 is a 1.7-billion-parameter causal language model from the Qwen3 series, pre-trained on 36 trillion tokens across 119 languages. Its pre-training corpus is larger and higher in quality than that of the previous generation and mixes coding, STEM, reasoning, and multilingual data. The model uses architectural refinements such as qk layernorm and a three-stage pre-training process, and it supports a 32,768-token context length. It is suited to broad language modeling, general knowledge acquisition, and reasoning tasks.
Qwen3-1.7B-Base Overview
mihai-777/evolai-tfm-1p5b-v5 is a 1.7-billion-parameter causal language model in the Qwen3 series. It builds on the Qwen2.5 generation with changes to both training data and architecture. It was pre-trained on a corpus of 36 trillion tokens covering 119 languages, with a mix of high-quality coding, STEM, reasoning, and multilingual content.
Key Improvements & Features
- Expanded Pre-training Corpus: Trained on a larger, higher-quality dataset of 36 trillion tokens across 119 languages, tripling the language coverage of Qwen2.5.
- Architectural Refinements: Incorporates training techniques and architectural improvements such as qk layernorm (normalization applied to the attention queries and keys), aimed at better training stability and performance.
- Three-stage Pre-training: Training proceeds in three stages: broad language modeling first, then a focus on reasoning skills (STEM, coding), and finally long-context comprehension with sequences of up to 32,768 tokens.
- Hyperparameter Tuning: Hyperparameters are tuned systematically on the basis of scaling law studies to improve training dynamics.
- Context Length: Supports a context window of 32,768 tokens.
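To make the qk layernorm item above more concrete, here is a minimal pure-Python sketch of the general idea: the query and key vectors are normalized before their dot product is taken, so the attention logit no longer grows with the raw magnitude of either vector. This is an illustration under assumptions, not this model's implementation. The card does not say which normalization is used, so the sketch assumes an RMS norm with no learned scale, and the names `rms_norm` and `attention_logit` are hypothetical.

```python
import math


def rms_norm(vec, eps=1e-6):
    """Scale a vector to unit root-mean-square (assumed variant, no learned gain)."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale for x in vec]


def attention_logit(q, k, use_qk_norm=True):
    """Scaled dot-product logit for one query/key pair of equal dimension."""
    if use_qk_norm:
        # Normalizing q and k bounds the logit regardless of their raw magnitude.
        q, k = rms_norm(q), rms_norm(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [4.0, 3.0, 2.0, 1.0]
    big_q = [1000.0 * x for x in q]
    # Without normalization the logit scales with the query magnitude;
    # with normalization it stays essentially unchanged.
    print(attention_logit(q, k, use_qk_norm=False),
          attention_logit(big_q, k, use_qk_norm=False))
    print(attention_logit(q, k), attention_logit(big_q, k))
```

Keeping logits in a bounded range is one plausible reason such a normalization would help training stability, which is the benefit the list above attributes to it.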
Use Cases
This model is well-suited for applications requiring:
- General language understanding and generation.
- Tasks involving STEM, coding, and logical reasoning.
- Processing and understanding long-form text due to its extended context length.
- Multilingual text in the 119 languages covered during pre-training.
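For the use cases above, the following sketch shows one way to run text completion with the Hugging Face `transformers` library. It assumes the repository loads through `AutoModelForCausalLM` and `AutoTokenizer`, which this card does not confirm. The helper names `clamp_new_tokens` and `complete` are hypothetical; the only model-specific values are the repository name and the 32,768-token context length stated in this card.

```python
MODEL_ID = "mihai-777/evolai-tfm-1p5b-v5"
MAX_CONTEXT_TOKENS = 32_768  # context length stated in this card


def clamp_new_tokens(prompt_tokens, requested_new_tokens,
                     max_context=MAX_CONTEXT_TOKENS):
    """Return how many new tokens still fit in the context window after the prompt."""
    if prompt_tokens < 0 or requested_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(max_context - prompt_tokens, 0)
    return min(requested_new_tokens, remaining)


def complete(prompt, requested_new_tokens=64):
    """Continue `prompt` with the model (downloads weights on first call)."""
    # Imported lazily so clamp_new_tokens works without these packages installed.
    # Assumption: the repository is compatible with the Auto* classes.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    budget = clamp_new_tokens(prompt_tokens, requested_new_tokens)
    if budget == 0:
        raise ValueError("prompt already fills the context window")

    output_ids = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("def fibonacci(n):")` would then return the prompt followed by the model's continuation. Because this is a pre-trained base model rather than a chat model, plain text continuation is likely a better fit than instruction-style prompts.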