xiaolesu/Qwen3-8B-Herald-SFT

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 18, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The xiaolesu/Qwen3-8B-Herald-SFT is an 8 billion parameter QLoRA fine-tune of Qwen/Qwen3-8B, developed by xiaolesu / Osmosis AI. This causal language model specializes in natural-language-to-Lean-4 autoformalization, translating English mathematical statements into formal Lean 4 theorem stubs. It is specifically designed as a supervised cold-start checkpoint for reinforcement learning, generating structured JSON output for downstream GRPO trainers.

Loading preview...

Model Overview

The xiaolesu/Qwen3-8B-Herald-SFT is an 8 billion parameter QLoRA fine-tuned model based on Qwen/Qwen3-8B, developed by xiaolesu / Osmosis AI. Its core function is natural-language-to-Lean-4 autoformalization (NL→FL), converting natural language mathematical statements into formal Lean 4 theorem stubs. The model was trained on the FrenzyMath/Herald_statements dataset, comprising 50,000 examples.

Key Capabilities

  • Lean 4 Autoformalization: Translates English mathematical statements into Lean 4 theorem signatures, including necessary Mathlib imports and sorry placeholders for proofs.
  • Structured Output: Generates output in a specific JSON format (#### {"header": ..., "formal_statement": ...}) designed for easy parsing.
  • SFT Cold-Start: Serves as a supervised fine-tuning (SFT) checkpoint, intended to initialize a GRPO (Goal-Oriented Reinforcement Learning with Proof Objectives) reinforcement learning phase.

Intended Use Cases

  • Direct Use: Automatically generating a corpus of Lean 4 theorem stubs from informal mathematical texts, competition problems, or research papers.
  • Downstream Use: Providing a robust starting point for GRPO training, where a Lean 4 compiler can offer binary reward signals based on type-checking the generated formal statements.

Out-of-Scope

This model is not designed for:

  • Writing complete Lean 4 proofs (it only generates theorem stubs).
  • Informal mathematical reasoning or problem-solving in natural language.
  • Working with Lean 3 or other proof assistants (e.g., Coq, Isabelle).
  • General-purpose coding tasks.