xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 5, 2026Architecture:Transformer Warm

The xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT model is an experimental checkpoint derived from the Qwen3-8B architecture, specifically the SFT+GRPO variant with 30% data overlap, developed by Xiaole Su, Kasey Zhang, and Andy Lyu. This model is designed for autoformalization tasks, exploring the impact of data overlap as a post-training hyperparameter. Its primary use case is research into improving the conversion of natural language to formal mathematical statements.

Loading preview...

Overview

The xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT is an experimental language model checkpoint, part of the research detailed in the paper "Data Overlap as a Post-Training Hyperparameter for Autoformalization" by Xiaole Su, Kasey Zhang, and Andy Lyu. This specific variant is built upon the Qwen3-8B architecture and represents the SFT+GRPO configuration with 30% data overlap, with its 'thinking' capability disabled.

Key Capabilities

  • Autoformalization Research: Primarily designed for experimental investigation into the process of converting natural language into formal mathematical or logical statements.
  • Data Overlap Study: Serves as a specific instance to study the effects of varying data overlap percentages during post-training on model performance in autoformalization tasks.
  • Qwen3-8B Base: Leverages the foundational capabilities of the Qwen3-8B model, adapted for this specific research context.

Good for

  • Academic Research: Ideal for researchers and practitioners exploring advanced techniques in autoformalization and the impact of training data characteristics.
  • Understanding Data Overlap: Useful for analyzing how different levels of data overlap influence the efficacy and robustness of models in complex reasoning tasks.
  • Experimental Development: Provides a specific, controlled checkpoint for further experimentation and fine-tuning within the domain of formal reasoning and AI.

For comprehensive details, results, and related artifacts, refer to the paper repository and the associated arXiv paper.