KAT-2-33B-FT: DPO-Aligned Academic Tutor
KAT-2-33B-FT is a 32.8-billion-parameter language model developed by Preston Mills at Progga AI and fine-tuned specifically for academic tutoring. Built on the Qwen2ForCausalLM architecture from the progga-ai/KAT-2-33B-BASE model, it uses Direct Preference Optimization (DPO) to instill academic integrity and effective pedagogical behaviors. The model was trained on 42,610 preference pairs for 3 epochs, reaching an evaluation reward accuracy of 89.6%, a significant improvement over the base model.
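DPO optimizes the policy directly from preference pairs, with no separate reward model. As a minimal sketch of the per-pair DPO loss (the log-probabilities below are illustrative placeholders, not values from this model):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin))."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log(sigmoid(logits))

# Before training, policy == reference, so every pair costs log(2):
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
# Once the policy upweights the preferred answer, the loss falls:
print(dpo_loss(-8.0, -13.0, -10.0, -12.0) < math.log(2))  # True
```

The reported reward accuracy corresponds to the fraction of evaluation pairs where the implicit reward (the beta-scaled log-ratio margin above) of the chosen response exceeds that of the rejected one.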
Key Capabilities
- Academic Integrity Enforcement: Refuses to complete graded work, instead offering hints and guidance.
- Socratic Tutoring: Encourages students to attempt problems first before providing assistance.
- Graduated Hints: Delivers progressively more detailed guidance based on student engagement and effort.
- Misconception Diagnosis: Identifies and addresses specific conceptual gaps in student understanding.
- Long Context: Supports a 32,768-token context window, allowing for extensive conversational history and complex problem-solving scenarios.
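The graduated-hint behavior can be framed as a simple escalation policy: each genuine attempt unlocks a more detailed hint, but a full solution is never released. A toy sketch (the hint texts and levels are hypothetical illustrations, not drawn from the model's training data):

```python
# Hypothetical hint ladder: effortful attempts unlock deeper levels,
# but no level ever reveals a complete solution.
HINT_LEVELS = [
    "Restate the problem in your own words. What is being asked?",
    "Which concept from this unit does the problem rely on?",
    "Set up the first step of the solution, then try to continue.",
]

def next_hint(attempts: int, showed_effort: bool) -> str:
    """Pick a hint level from the student's attempt count and engagement."""
    if attempts == 0:
        return "Please share your attempt first - what have you tried?"
    # Without visible effort, stay at the shallowest level.
    level = min(attempts - 1, len(HINT_LEVELS) - 1) if showed_effort else 0
    return HINT_LEVELS[level]

print(next_hint(0, False))  # asks for an attempt before hinting at all
print(next_hint(3, True))   # deepest hint, still not a full answer
```

The DPO preference pairs plausibly contrast this kind of graduated response (chosen) against direct answer-giving (rejected), which is what the integrity-enforcement and Socratic behaviors above reward.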
Ideal Use Cases
- Educational Platforms: Integrating into online learning environments for personalized, integrity-focused academic support.
- Student Support Systems: Providing AI-powered tutoring that guides students through learning challenges without giving direct answers.
- Research in AI Ethics: Studying DPO alignment for enforcing ethical guidelines and specific behavioral constraints in LLMs, particularly in sensitive domains like education.