xiaolesu/Lean4-sft-tk-8b

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Mar 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

The xiaolesu/Lean4-sft-tk-8b model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the xiaolesu/lean4-sft-stmt-tk dataset, indicating specialization in Lean 4 theorem proving and formal verification tasks. The released model advertises a 32,768-token context length (fine-tuning itself used 8192-token sequences; see the training details below), and training was done with Axolotl using Liger kernel optimizations.

Model Overview

xiaolesu/Lean4-sft-tk-8b is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base model. It was developed with the Axolotl framework, with several Liger kernel optimizations enabled: liger_rope, liger_rms_norm, liger_glu_activation, liger_layer_norm, and liger_fused_linear_cross_entropy.
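
A minimal inference sketch using the standard Hugging Face transformers API is shown below. The prompt is a placeholder: the card does not document whether the fine-tune expects a chat template or raw completion prompts, and the dtype choice is an assumption.

```python
# Minimal sketch: loading and querying the model via the standard
# transformers API. The prompt format is a placeholder, not documented
# by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xiaolesu/Lean4-sft-tk-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,  # assumed; the card lists an FP8 serving quant
    device_map="auto",
)

prompt = "Formalize in Lean 4: the sum of two even integers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```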

Key Training Details

  • Base Model: Qwen/Qwen3-8B
  • Dataset: Fine-tuned on the xiaolesu/lean4-sft-stmt-tk dataset, suggesting specialization in Lean 4-related tasks.
  • Sequence Length: Fine-tuned with a sequence length of 8192 tokens and flex_attention enabled; the released model advertises a 32k context length.
  • Hyperparameters: Trained with a learning rate of 1e-05, the fused AdamW optimizer (adamw_torch_fused), and a cosine learning rate scheduler with 53 warmup steps; a configuration sketch follows this list.
  • Frameworks: Transformers 5.3.0, PyTorch 2.9.1+cu128, Datasets 4.5.0, and Tokenizers 0.22.2.
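
Since Axolotl is configured through a YAML file, the details above can be restated as a hedged configuration sketch. Only keys whose values are stated in this card are filled in; the dataset format, batch size, epoch count, and all other settings are unknown and omitted.

```yaml
# Partial reconstruction of the Axolotl config from the values above.
# This is a sketch, not the published config.
base_model: Qwen/Qwen3-8B

datasets:
  - path: xiaolesu/lean4-sft-stmt-tk
    # dataset format/type is not stated in the card

sequence_len: 8192
flex_attention: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
liger_layer_norm: true
liger_fused_linear_cross_entropy: true

learning_rate: 1.0e-5
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 53
```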

Potential Use Cases

Given its specific training dataset, this model is likely optimized for:

  • Assisting with Lean 4 theorem proving.
  • Generating or understanding Lean 4 code and formal statements (an illustrative example follows this list).
  • Applications requiring specialized knowledge in formal verification or mathematical logic within the Lean 4 ecosystem.
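
To make the target domain concrete, here is an illustrative example of the kind of Lean 4 statement and proof such a model might be asked to produce. It is not drawn from the training set and assumes a Mathlib environment.

```lean
import Mathlib

-- Illustrative only: a formal statement of "the sum of two even
-- integers is even", with a short proof. `Even a` unfolds to
-- `∃ r, a = r + r` in Mathlib.
theorem even_add_even {a b : ℤ} (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨m, hm⟩ := ha
  obtain ⟨n, hn⟩ := hb
  exact ⟨m + n, by rw [hm, hn]; ring⟩
```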