KAT-2-33B-FT: DPO-Aligned Academic Tutor
KAT-2-33B-FT (published as prestonpai/KAT-2-33B-FT) is a 32.8 billion parameter language model developed by Preston Mills of Progga AI and fine-tuned specifically for academic tutoring. Built on the Qwen2ForCausalLM architecture from the progga-ai/KAT-2-33B-BASE model, it was aligned with Direct Preference Optimization (DPO) to favor integrity-preserving, pedagogically sound tutoring behavior over answer-giving. Training ran for 3 epochs over 42,610 preference pairs and reached an evaluation reward accuracy of 89.6%, a marked improvement over the base model.
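For context on what such a run looks like, below is a minimal sketch of a comparable DPO fine-tune using Hugging Face's trl library. The base model id, pair count, and epoch count come from this card; the dataset file, beta, learning rate, and batch settings are illustrative assumptions, not the actual recipe. Note that DPO needs no separate reward model: each preference pair directly contrasts a hint-giving completion (chosen) with an answer-giving one (rejected).

```python
# Sketch of a DPO run comparable to the one described in this card.
# Base model id, 3 epochs, and the 42,610-pair scale come from the card;
# dataset file name and hyperparameters below are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "progga-ai/KAT-2-33B-BASE"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Preference pairs with "prompt", "chosen" (hint-giving reply), and
# "rejected" (direct-answer reply) columns; the actual run used 42,610 pairs.
pairs = load_dataset("json", data_files="tutor_preference_pairs.jsonl", split="train")

args = DPOConfig(
    output_dir="KAT-2-33B-FT",
    num_train_epochs=3,               # matches the card
    beta=0.1,                         # assumed; typical DPO KL-penalty strength
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,   # assumed
    learning_rate=5e-7,               # assumed; DPO usually uses small LRs
    bf16=True,
)

# With no explicit ref_model, trl snapshots the base policy as the DPO reference.
trainer = DPOTrainer(model=model, args=args, train_dataset=pairs, processing_class=tokenizer)
trainer.train()
```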
Key Capabilities
- Academic Integrity Enforcement: Refuses to complete graded work, offering hints and guidance instead (see the usage sketch after this list).
- Socratic Tutoring: Encourages students to attempt problems before receiving assistance.
- Graduated Hints: Delivers progressively more detailed guidance as the student demonstrates engagement and effort.
- Misconception Diagnosis: Identifies and addresses specific conceptual gaps in student understanding.
- Long Context: Supports a 32,768 token window, accommodating extensive conversational history and complex, multi-step problems.
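The refusal-then-hint behavior can be exercised directly. Below is a minimal inference sketch assuming the model follows standard transformers conventions (AutoModelForCausalLM with a Qwen2-style chat template); the system prompt and student turns are illustrative, not an official prompt format.

```python
# Minimal sketch: load the model and probe the integrity / graduated-hint
# behavior. Assumes standard transformers loading and a Qwen2-style chat
# template; the system prompt and messages are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prestonpai/KAT-2-33B-FT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~66 GB of weights in bf16; shard or quantize to fit
    device_map="auto",
)

# Multi-turn exchange: the student first demands a direct answer, then shows
# effort, which should earn a more detailed (graduated) hint.
messages = [
    {"role": "system", "content": "You are an academic tutor. Give hints, never direct answers to graded work."},
    {"role": "user", "content": "Just give me the answer: solve x^2 - 5x + 6 = 0."},
    {"role": "assistant", "content": "I can't hand you the answer, but let's factor it together. What two numbers multiply to 6 and sum to -5?"},
    {"role": "user", "content": "I tried 2 and 3, but 2 + 3 is 5, not -5."},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```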
Ideal Use Cases
- Educational Platforms: Embedding the model in online learning environments for personalized, integrity-focused academic support.
- Student Support Systems: Providing AI-powered tutoring that guides students through learning challenges without giving direct answers.
- Research in AI Ethics: Studying DPO alignment for enforcing ethical guidelines and specific behavioral constraints in LLMs, particularly in sensitive domains like education.