DDeduPModelv7: Educational Task Optimization
DDeduPModelv7 is an 8-billion-parameter language model fine-tuned from the unsloth/llama-3-8b-bnb-4bit base model. Developed by Fralet, it is engineered for educational tasks, with a particular focus on curriculum alignment in higher education, both in Kazakhstan and internationally.
Key Capabilities
- Educational Text Generation: Generates educational content relevant to curriculum and course contexts.
- Multi-label Classification: Assigns multiple labels to a single input in educational contexts, such as mapping one course to several learning outcomes.
- Course-to-Outcome Mapping: Accurately maps university courses to program learning outcomes across various academic disciplines, including environmental science, pedagogy, pharmacology, ecology, IT, psychology, geodesy, art, linguistics, agriculture, geology, land management, and mathematics.
- Multilingual Support: Trained on a comprehensive dataset of 137,226 examples in Kazakh, Russian, and English.
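The mapping and multi-label capabilities above can be exercised through plain prompting. The template and parser below are illustrative assumptions, not the model's documented prompt format; check the model's own examples before relying on them.

```python
# Hypothetical prompt template and response parser for course-to-outcome
# mapping. The exact prompt format DDeduPModelv7 expects is an assumption.

PROMPT_TEMPLATE = (
    "Map the following university course to the program learning outcomes "
    "it supports.\nCourse: {course}\n"
    "Candidate outcomes:\n{outcomes}\n"
    "Answer with a comma-separated list of outcome codes."
)

def build_prompt(course: str, outcomes: dict) -> str:
    """Render the mapping prompt for one course."""
    listed = "\n".join(f"- {code}: {text}" for code, text in outcomes.items())
    return PROMPT_TEMPLATE.format(course=course, outcomes=listed)

def parse_labels(response: str, valid_codes: set) -> list:
    """Extract the multi-label answer, keeping only known outcome codes."""
    tokens = [tok.strip() for tok in response.split(",")]
    return [code for code in tokens if code in valid_codes]
```

Filtering against a fixed code list guards against the model emitting outcome codes that do not exist in the program.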
Training & Methodology
The model was fine-tuned using Hugging Face's TRL library in conjunction with Unsloth, which significantly accelerated training. Training used Fralet/DDeduPDatasetv2, a specialized dataset designed for educational alignment tasks.
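A setup along these lines can be sketched with Unsloth and TRL's SFTTrainer. Only the base model and dataset names come from this card; the hyperparameters, LoRA settings, text layout, and dataset field name below are assumptions, not the authors' actual configuration.

```python
# Illustrative fine-tuning sketch (Unsloth + TRL). Requires a GPU; the
# heavy imports are kept inside train() so the helpers can be used alone.

def format_example(instruction: str, response: str) -> str:
    """Join one training pair into a single supervised text (assumed layout)."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

def train():
    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from trl import SFTTrainer
    from transformers import TrainingArguments

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/llama-3-8b-bnb-4bit",
        max_seq_length=2048,
        load_in_4bit=True,
    )
    # Attach LoRA adapters; rank and target modules are assumed values.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    dataset = load_dataset("Fralet/DDeduPDatasetv2", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",  # assumed column name
        max_seq_length=2048,
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            learning_rate=2e-4,
            num_train_epochs=1,
            output_dir="outputs",
        ),
    )
    trainer.train()
```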
Intended Use Cases
- Curriculum Alignment: Automating the mapping of university courses to learning outcomes for syllabus development and accreditation.
- Tool Integration: Embedding into educational software to assist educators in identifying program gaps.
- Research Support: Aiding researchers by providing structured data on course-outcome relationships.
- Job Market Alignment: Supporting the alignment of educational programs with job market demands.
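The gap-identification use case above reduces to a small post-processing step over the model's course-to-outcome mappings. The data shapes here are assumptions for illustration:

```python
# Hypothetical post-processing: given model-produced course-to-outcome
# mappings, flag program learning outcomes that no course supports (a "gap").

def find_outcome_gaps(mappings: dict, all_outcomes: set) -> set:
    """Return outcomes that appear in no course's mapping."""
    covered = {code for codes in mappings.values() for code in codes}
    return all_outcomes - covered

mappings = {
    "Ecology 101": ["LO1", "LO2"],
    "Geodesy 210": ["LO2"],
}
gaps = find_outcome_gaps(mappings, {"LO1", "LO2", "LO3"})
# → {"LO3"}: no course in the program addresses LO3
```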
Limitations
This model is intended to support human expertise, not replace it. Users should review outputs for institutional compliance and be aware of potential biases stemming from the dataset's focus on specific disciplines or regions (e.g., Kazakhstani contexts). While multilingual, further fine-tuning may be required for optimal performance in other languages or domains.