formalmathatepfl/mistral-7B-v0.3-finetuned

Task: text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 24, 2026 · License: other · Architecture: Transformer

formalmathatepfl/mistral-7B-v0.3-finetuned is a 7-billion-parameter language model developed by formalmathatepfl, fine-tuned from Mistral-7B-v0.3 on the lean_sft dataset, indicating a specialization in formal mathematics and theorem proving. It reaches a final validation loss of 0.0542, suggesting effective learning on its specialized dataset.


Model Overview

This model is derived from mistralai/Mistral-7B-v0.3 and fine-tuned on the lean_sft dataset, which suggests a focus on formal-mathematics and theorem-proving tasks, most likely within the Lean proof assistant ecosystem.
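Because the model is derived from Mistral-7B-v0.3, it should load with the standard Hugging Face transformers API. A minimal sketch, assuming the repository ships the usual tokenizer and weight files (the prompt format is an illustrative guess, not documented in this card):

```python
# Minimal loading sketch, assuming standard Mistral-7B / transformers
# conventions; not verified against this specific repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "formalmathatepfl/mistral-7B-v0.3-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; the hosted FP8 quantization is a deployment detail
    device_map="auto",
)

# Illustrative prompt only: the expected prompt format is not documented.
prompt = "theorem add_comm' (a b : Nat) : a + b = b + a := by"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```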

Training Details

The model was trained for one epoch with a learning rate of 5e-05, using the AdamW optimizer and a cosine learning-rate schedule with a warmup ratio of 0.05. Training used an effective batch size of 32 across 8 GPUs with 2 gradient-accumulation steps (i.e., a per-device batch size of 2, since 8 × 2 × 2 = 32). Validation loss decreased steadily over training, reaching a final reported value of 0.0542.
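These hyperparameters map directly onto Hugging Face TrainingArguments. A minimal sketch of the reported configuration (the per-device batch size is inferred from the arithmetic above; the output directory and precision setting are assumptions, since the card does not state them):

```python
# Sketch of the reported hyperparameters, not the authors' actual training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7B-v0.3-finetuned",  # hypothetical path
    num_train_epochs=1.0,
    learning_rate=5e-5,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=2,   # inferred: 32 / (8 GPUs * 2 accumulation steps)
    gradient_accumulation_steps=2,
    bf16=True,                       # assumption: mixed-precision training is typical but not stated
)
```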

Performance

  • Validation loss: a final validation loss of 0.0542 on the lean_sft data, consistent with effective learning on the specialized dataset. No downstream theorem-proving benchmarks are reported.

Potential Use Cases

Given its fine-tuning on the lean_sft dataset, this model is likely optimized for the following (see the illustrative sketch after this list):

  • Assisting with formal mathematics proofs.
  • Generating candidate proof scripts or code for proof assistants such as Lean.
  • Tasks requiring precise logical reasoning and adherence to formal systems.
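The exact format of lean_sft is not documented in this card, but supervised fine-tuning data for Lean typically pairs a theorem statement with a tactic proof. A purely illustrative example of the kind of completion such a model might produce (the statement and proof below are invented for illustration, not drawn from the dataset):

```lean
-- Illustrative only: not taken from the lean_sft dataset.
-- Given a statement ending in `:= by`, the model would be expected
-- to continue with a tactic proof such as the one below.
theorem two_mul_add (a b : Nat) : 2 * (a + b) = 2 * a + 2 * b := by
  rw [Nat.mul_add]
```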