JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback
JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback is a 4 billion parameter Qwen3-based language model fine-tuned by JoaoZaokk. This model is specifically optimized for code-related tasks, including Python code generation, explanation, and simple debugging. It was trained using QLoRA/LoRA on Python instruction and code feedback datasets, resulting in a merged safetensors model with a 32768 token context length, making it suitable for local experimentation with code-centric applications.
Loading preview...
Overview
JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback is a 4 billion parameter model built on the Qwen3 architecture, developed by JoaoZaokk. It is a merged fine-tune, specifically optimized for code-related tasks. The model was trained using QLoRA/LoRA on a base model, then merged into a full safetensors model, making it ready for direct use.
Key Capabilities & Training
- Code-focused Fine-tune: Specialized for Python code generation, explanation, and basic debugging.
- Training Data: Fine-tuned on 5,000 samples from
iamtarun/python_code_instructions_18k_alpacaand 5,000 samples fromm-a-p/CodeFeedback-Filtered-Instruction. - Architecture: Based on the Qwen3 family, with 4 billion parameters and a 32768 token context length.
- Training Method: Utilized QLoRA/LoRA with a rank of 16 and alpha of 32, targeting key projection layers.
Intended Use & Limitations
This model is designed for local experimentation in areas like Python code generation, code explanation, simple debugging, and instruction-following tests. It is also suitable for downstream conversion to formats like GGUF, AWQ, GPTQ, or OpenVINO. As an experimental model, users should be aware that it may produce incorrect code, unsafe suggestions, or hallucinated explanations, and outputs require review before production use.