Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3

Hugging Face
  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 3.1B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: May 29, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights, Warm

Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3 is a 3.1-billion-parameter language model derived from Qwen/Qwen2.5-3B-Instruct. It went through a Stage 1 pretraining phase, run as LoRA SFT on the MALLS dataset for 3 epochs. The model serves as an intermediate base for Stage 2 fine-tuning on specific target datasets.


Model Overview

Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3 is an intermediate language model derived from the Qwen/Qwen2.5-3B-Instruct base model. It has 3.1 billion parameters and a context length of 32,768 tokens.
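As a rough illustration of how the figures above might be used, the sketch below loads the model in BF16 with the Hugging Face `transformers` library and checks a prompt against the 32,768-token window. It assumes the repository hosts full weights loadable through `AutoModelForCausalLM` rather than a standalone LoRA adapter, which this page does not confirm. The repository name is copied as displayed here, and the helper names are illustrative.

```python
REPO_ID = "Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3"  # name as shown on this page
CONTEXT_LENGTH = 32768  # tokens, from the overview above


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the generation budget fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


def load_model(repo_id: str = REPO_ID):
    """Load tokenizer and model in BF16.

    Assumption: the repo contains full model weights. Imports are deferred
    so that fits_in_context() can be used without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` downloads the weights on first use. `fits_in_context(len(tokenizer(prompt).input_ids), 512)` is one way to guard against overlong prompts before generating.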

Key Characteristics

  • Base Model: Built on Qwen/Qwen2.5-3B-Instruct.
  • Pretraining: Underwent a Stage 1 pretraining phase using LoRA SFT (supervised fine-tuning with Low-Rank Adaptation).
  • Dataset: Stage 1 used 5,000 samples from the MALLS dataset, trained for 3 epochs.
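To put the training figures in perspective, the following back-of-the-envelope sketch computes the number of sample passes and optimizer steps, plus the parameters LoRA adds to a single weight matrix. Only the 5,000 samples and 3 epochs come from this page. The batch size, LoRA rank, and matrix dimensions are hypothetical values chosen for illustration, since the actual training configuration is not stated.

```python
import math

NUM_SAMPLES = 5_000  # from the card
NUM_EPOCHS = 3       # from the card


def optimizer_steps(num_samples: int, num_epochs: int, effective_batch_size: int) -> int:
    """Optimizer steps when the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(num_samples / effective_batch_size)
    return steps_per_epoch * num_epochs


def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA replaces an update to a d_out x d_in matrix with B (d_out x r) @ A (r x d_in)."""
    return rank * (d_in + d_out)


total_sample_passes = NUM_SAMPLES * NUM_EPOCHS
# Hypothetical settings, NOT from the card: batch size 16, rank 8, 2048 x 2048 projection.
steps = optimizer_steps(NUM_SAMPLES, NUM_EPOCHS, effective_batch_size=16)
extra = lora_added_params(d_in=2048, d_out=2048, rank=8)
print(total_sample_passes, steps, extra)
```

With these illustrative settings, one adapted projection trains 32,768 parameters instead of the roughly 4.2 million in the full matrix, which is why a LoRA run over a small dataset is a relatively cheap first stage.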

Intended Use

This model is designed as an intermediate checkpoint. Its primary purpose is to serve as a base for Stage 2 fine-tuning on more specific target datasets. Developers can adapt it to their own use cases, building on the initial MALLS training.
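One possible shape for a Stage 2 setup is sketched below using the `peft` library: a fresh LoRA adapter is attached to this checkpoint before training on a target dataset. This is not the authors' recipe. The rank, alpha, dropout, and target modules are hypothetical defaults, and the sketch assumes the repository loads as a full causal language model.

```python
REPO_ID = "Laplaces-Red-Devils/fol-pretrain-malls-qwen2.5-3"  # name as shown on this page


def stage2_lora_kwargs(rank: int = 8, alpha: int = 16, dropout: float = 0.05) -> dict:
    """Hypothetical Stage 2 LoRA settings; none of these values come from the card."""
    return {
        "r": rank,
        "lora_alpha": alpha,
        "lora_dropout": dropout,
        # Attention projection names used by Qwen2-style decoder layers.
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
        "task_type": "CAUSAL_LM",
    }


def build_stage2_model(repo_id: str = REPO_ID, **overrides):
    """Load the Stage 1 checkpoint and wrap it with a new trainable LoRA adapter."""
    import torch
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    config = LoraConfig(**stage2_lora_kwargs(**overrides))
    return get_peft_model(base, config)
```

After `model = build_stage2_model(rank=16)`, calling `model.print_trainable_parameters()` reports how many weights the adapter adds. The wrapped model can then be passed to whatever training loop the target dataset calls for.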