CDLM-0.5B: Corrective Diffusion Language Model
CDLM-0.5B is a 0.5-billion-parameter model developed by Shuibai12138, designed for code generation and correction. It is fine-tuned from fredzzp/open-dcoder-0.5B, a masked diffusion language model built on the Qwen2 architecture. The core innovation of CDLM-0.5B is its error-aware training methodology: a mixture objective that explicitly supervises visible incorrect tokens during training, rather than only the masked positions.
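To make the mixture objective concrete, here is a conceptual sketch, not the paper's exact formulation: the usual masked-denoising cross-entropy is combined with an extra corrective term on visible tokens that are known to be incorrect. The function names, the averaging scheme, and the mixing weight `lam` are all illustrative assumptions.

```python
# Conceptual sketch of a mixture objective (assumed form, not the
# paper's verified loss): masked-denoising loss + corrective loss on
# visible-but-incorrect tokens.
import math


def nll(probs, target):
    """Negative log-likelihood of the target token under a distribution."""
    return -math.log(probs[target])


def mixture_loss(pred_probs, targets, masked_pos, incorrect_pos, lam=1.0):
    """Average denoising loss over masked positions plus a weighted
    corrective loss over visible-but-incorrect positions.
    `lam` is a hypothetical mixing weight."""
    mask_term = sum(nll(pred_probs[i], targets[i]) for i in masked_pos)
    mask_term /= max(len(masked_pos), 1)
    corr_term = sum(nll(pred_probs[i], targets[i]) for i in incorrect_pos)
    corr_term /= max(len(incorrect_pos), 1)
    return mask_term + lam * corr_term


# Toy example: 3 positions, vocabulary of size 3.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
targets = [0, 1, 2]
# Position 0 is masked; position 2 is visible but flagged incorrect.
print(round(mixture_loss(probs, targets, masked_pos=[0], incorrect_pos=[2]), 3))
```

The key property this sketch tries to convey is that incorrect-but-visible tokens contribute gradient signal, instead of being treated as trusted context the way visible tokens are in a standard masked objective.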
Key Capabilities
- Error-Aware Code Generation: Trained with explicit supervision on incorrect tokens, the model produces more robust and correct code outputs.
- Targeted Refinement: It can perform iterative refinement, allowing for precise correction of code snippets by focusing on identified incorrect tokens.
- Masked Diffusion Language Model (MDLM): Leverages a diffusion-based approach for generation, which contributes to its corrective abilities.
- Custom diffusion_generate Method: Requires trust_remote_code=True for its unique generation process, which enables its specialized refinement features.
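A minimal usage sketch follows, assuming the model is published under the author's namespace on the Hugging Face Hub and that diffusion_generate accepts input ids plus a token budget; the repository id, the argument names, and the helper function are assumptions, since the exact signature is defined by the model's remote code. The offline helper illustrates the targeted-refinement idea: tokens flagged as incorrect are replaced with a mask id so the diffusion model can re-denoise only those positions.

```python
# Sketch, not a verified API: loading CDLM-0.5B and calling its custom
# diffusion_generate method, plus an offline helper illustrating the
# masking step behind targeted refinement.
from typing import List


def mask_incorrect_tokens(token_ids: List[int],
                          bad_positions: List[int],
                          mask_id: int) -> List[int]:
    """Replace tokens flagged as incorrect with the mask id so only
    those positions are regenerated during refinement."""
    bad = set(bad_positions)
    return [mask_id if i in bad else t for i, t in enumerate(token_ids)]


def generate_with_cdlm(prompt: str) -> str:
    """Requires network access; diffusion_generate is supplied by the
    repository's custom code, hence trust_remote_code=True."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    name = "Shuibai12138/CDLM-0.5B"  # assumed Hub repository id
    tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)
    inputs = tok(prompt, return_tensors="pt")
    # Hypothetical call: keyword arguments are illustrative, not verified.
    out = model.diffusion_generate(inputs.input_ids, max_new_tokens=128)
    return tok.decode(out[0], skip_special_tokens=True)


# Offline demonstration of the masking helper:
ids = [101, 7, 42, 9, 102]
print(mask_incorrect_tokens(ids, bad_positions=[2, 3], mask_id=0))
# [101, 7, 0, 0, 102]
```

The masked sequence would then be passed back through diffusion_generate, so that correction touches only the flagged tokens while the surrounding context is preserved.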
Good For
- Code Correction: Ideal for tasks requiring the identification and rectification of errors in existing code.
- Improved Code Quality: Enhances the accuracy and reliability of generated code by explicitly addressing potential mistakes.
- Research in Diffusion Models for Code: Provides a practical implementation of corrective diffusion language models for further study and development. For detailed insights into its training and methodology, refer to the Corrective Diffusion Language Models paper.