Name: IsmaelMousa/Qwen2.5-3B-Instruct-EngSaf-628K API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: IsmaelMousa

Model Overview

IsmaelMousa/Qwen2.5-3B-Instruct-EngSaf-628K is a specialized large language model fine-tuned from Qwen2.5-3B-Instruct by Ismael Mousa. With 3.1 billion parameters and a 32K context length, it is designed for Automatic Essay Grading (AEG), with a focus on short-answer responses.

Key Capabilities

Essay Grading: Evaluates student answers against reference answers and mark schemes.
Rationale Generation: Provides a textual rationale explaining the assigned score.
JSON Output: Designed to output scores and rationales in a structured JSON format.
Domain-Specific Training: Fine-tuned on the EngSAF-628K dataset, comprising short-answer responses from engineering examinations.
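
The grading inputs and JSON output described above can be handled with a small helper. This is a minimal sketch, not the model's documented interface: the prompt layout, the `score` and `rationale` field names, and the helper names `build_prompt` and `parse_grading_output` are all assumptions made for illustration. Check the full model card for the actual prompt template and output schema.

```python
import json
from dataclasses import dataclass


@dataclass
class GradingResult:
    score: int
    rationale: str


def build_prompt(question, reference_answer, mark_scheme, student_answer):
    """Assemble a grading prompt.

    The layout is hypothetical; the model's real template may differ.
    """
    return (
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Mark scheme: {mark_scheme}\n"
        f"Student answer: {student_answer}\n"
        'Respond with JSON containing "score" and "rationale".'
    )


def parse_grading_output(raw):
    """Pull the outermost {...} span out of the reply and validate it.

    Assumes the reply holds one JSON object with "score" and "rationale"
    keys (assumed names). Raises ValueError or KeyError on malformed output.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])
    return GradingResult(score=int(data["score"]), rationale=str(data["rationale"]))
```

Parsing defensively is worthwhile here because a generative model can wrap the JSON in extra text or omit a field, and a grading pipeline should fail loudly rather than record a wrong score.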

Performance Metrics

Evaluation on a held-out test set produced the following results for scoring and rationale generation:

Score F1: 0.6141
Score Accuracy: 0.6200
Score Cohen's Kappa (CKS): 0.4123
Rationale F1 (BERT-Score): 0.6438
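
To make the scoring metrics above concrete, the sketch below computes accuracy and Cohen's kappa from paired lists of reference and predicted scores. It uses the standard definition of kappa, (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance. The function names are illustrative, and nothing here is taken from the author's evaluation code, so the exact procedure behind the reported numbers may differ.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the reference score."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of marginal label frequencies, summed over labels.
    p_e = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)
```

Kappa sits below accuracy whenever some of the agreement could arise by chance, which is consistent with the reported kappa of 0.4123 against an accuracy of 0.6200.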

When to Use This Model

This model is particularly well-suited for:

Automated Educational Assessment: Grading short-answer questions in academic settings.
Feedback Generation: Providing structured feedback to students based on their responses.
Research in AEG: As a baseline or component in further research on automatic essay grading systems.

Limitations

In evaluation, the model sometimes identified the key aspects of a student answer correctly yet assigned a score that did not match the rubric criteria. It was trained primarily on engineering examination data, so performance in other domains may suffer.
