Name: visproj/proofkit-distilled-qwen0.5b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: visproj

ProofKit Qwen 0.5B Distilled Model

This model, visproj/proofkit-distilled-qwen0.5b, is a 0.5 billion parameter variant of the Qwen2.5-0.5B-Instruct architecture. It was developed by visproj through a sequence-level data distillation process, where it was fine-tuned using completions from a larger gpt-oss-20b teacher model (visproj/proofkit-gpt-oss-20b-lora) over ProofKit's specific prompts.

Key Capabilities and Performance

Distilled Performance: Achieves evaluation scores comparable to its 20B teacher model on held-out ProofKit prompts, with an average score of 76.6 across a 3-judge panel (Claude Opus 4.7, GPT-5.5, Qwen-3B).
Efficiency: As a 0.5B parameter model, it offers a highly efficient solution for its specialized task, outperforming larger untuned base models and older controls.
Purpose-Built: Designed as a core component for the ProofKit application, a work-sample generator for job seekers.

Limitations and Usage

Prompt Format Dependency: This model is prompt-format-frozen; it is specifically trained on the exact prompt shapes used by ProofKit and will not perform optimally with reworded or free-form prompts.
Specialized Use: It is a purpose-built component for the ProofKit app, not intended as a general chat model.

This model represents a significant improvement over earlier versions, incorporating a fix for synthetic-data leakage through "faithfulness anchors" and "seeded per-example variation" to ensure more reliable and context-aware outputs.

Overview

ProofKit Qwen 0.5B Distilled Model

Key Capabilities and Performance

Limitations and Usage

Full Model Card (README)