andrewlngdn/dsl-debug-7b-sft-rl
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Mar 12, 2026License:mitArchitecture:Transformer Open Weights Cold

andrewlngdn/dsl-debug-7b-sft-rl is a Qwen2.5-7B-Instruct fine-tuned model developed by Andrew Lngdn, specifically optimized for debugging programs within a custom dataflow DSL. This model utilizes a two-stage training process involving Supervised Fine-Tuning (SFT) followed by GRPO reinforcement learning. It demonstrates strong performance in identifying and correcting errors across standard, nonlocal, and intent-mismatch debugging scenarios. Its primary strength lies in interactive code debugging using a defined set of tools and turns.

Loading preview...