Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3
Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3 is a 3.1 billion parameter language model based on the Qwen2.5-3B-Instruct architecture. It underwent a Stage 1 pretraining using LoRA SFT on the MALLS dataset for 3 epochs. This model serves as an intermediate base for further Stage 2 fine-tuning on specific target datasets.
Loading preview...
Model Overview
Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3 is an intermediate language model derived from the Qwen/Qwen2.5-3B-Instruct base model. It features 3.1 billion parameters and a context length of 32768 tokens.
Key Characteristics
- Base Model: Built upon the robust Qwen2.5-3B-Instruct architecture.
- Pretraining: Underwent a Stage 1 pretraining phase using LoRA SFT (Low-Rank Adaptation for Supervised Fine-Tuning).
- Dataset: This pretraining was conducted on the MALLS dataset, utilizing 5,000 samples over 3 epochs.
Intended Use
This model is specifically designed as a foundational intermediate model. Its primary purpose is to serve as a base for subsequent Stage 2 fine-tuning on more specific, target datasets. Developers can leverage this pre-trained version to further adapt it to their particular use cases, building upon the initial MALLS dataset training.