Name: xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-Overlap API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: xiaolesu

Model Overview

The xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-Overlap is an experimental model checkpoint developed by Xiaole Su, Kasey Zhang, and Andy Lyu. It is based on the Qwen3-8B architecture and is a specific variant used in research concerning "Data Overlap as a Post-Training Hyperparameter for Autoformalization." This particular version represents the SFT+GRPO with 100% overlap condition, serving as a control where the Generative Replay Policy Optimization (GRPO) process reuses the entire dataset from Supervised Fine-Tuning (SFT).

Key Characteristics

Experimental Checkpoint: Part of a research study on data overlap in autoformalization.
Base Model: Built upon the Qwen3-8B architecture.
Training Method: Combines Supervised Fine-Tuning (SFT) with Generative Replay Policy Optimization (GRPO).
Data Overlap: Features 100% data overlap, meaning GRPO reuses all SFT data.
Purpose: Acts as a control condition to analyze the effects of data overlap on model performance in autoformalization tasks.

Research Context

This model is directly associated with the paper "SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization" by Xiaole Su, Kasey Zhang, and Andy Lyu, available on arXiv. The paper's repository provides further details, results, and related artifacts for this experimental work.

Overview

Model Overview

Key Characteristics

Research Context

Full Model Card (README)