Healshsj/Dew-1.2B-safetensors
Healshsj/Dew-1.2B-safetensors is a 1.2 billion parameter language model, fine-tuned from LiquidAI/LFM2.5-1.2B-Thinking with a 32768 token context length. It was developed by Healshsj through a staged CPT, SFT, and DPO process, focusing on deep reasoning and honest uncertainty. The model demonstrates improved performance on STEM-related tasks like MMLU and ARC-Challenge compared to its base model. It is designed for applications requiring structured reasoning and question answering.
Loading preview...
Dew-1.2B: A Small Model for Deep Reasoning
Dew-1.2B is a 1.2 billion parameter language model developed by Healshsj, fine-tuned from the LiquidAI/LFM2.5-1.2B-Thinking base model. It features a substantial 32768 token context length, enabling it to process longer inputs and maintain context over extended interactions. The model's development focused on enhancing its reasoning capabilities while explicitly acknowledging uncertainty, as highlighted by its motto: "Small model. Deep reasoning. Honest uncertainty."
Training Methodology
The model underwent a rigorous three-phase training process:
- Phase 1 CPT (Continued Pre-Training): Utilized datasets like
peS2o,finemath-4plus, andopen-web-mathto build a strong mathematical and scientific foundation. - Phase 2 SFT (Supervised Fine-Tuning): Incorporated a diverse range of datasets including
NuminaMath-CoT,MathInstruct,CAMEL,GPQA,OpenMathInstruct-2,GenesisII,MegaScience,SciFact,HotpotQA,UltraChat-200k,SlimOrca-Dedup,Glaive/xLAM tool-calling,robot policy traces,Roman, and a subset ofLima. - Phase 3 DPO (Direct Preference Optimization): Applied
UltraFeedback binarizedto align the model with preferred responses and further refine its output quality.
Performance Benchmarks
Dew-1.2B shows notable improvements over its base model, LFM2.5-1.2B:
- MMLU STEM: Improved from 22.0% to 28.0%.
- ARC-Challenge: Improved from 35.4% to 38.9%.
Key Features and Use Cases
- Structured Reasoning: Employs a unique ChatML format with
<think>blocks, allowing the model to explicitly articulate its reasoning process before providing an answer. This makes it suitable for applications where transparency in decision-making is crucial. - Enhanced STEM Performance: The specialized training datasets and fine-tuning stages have equipped Dew-1.2B with improved capabilities for scientific, technical, engineering, and mathematical tasks.
- Question Answering: Its training on various QA datasets suggests strong performance in extracting and synthesizing information to answer complex questions.
This model is particularly well-suited for developers looking for a compact yet capable model for tasks requiring detailed reasoning, especially within scientific or technical domains, and where understanding the model's thought process is beneficial.