Name: cs-552-2026-databand/general_knowledge_model API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cs-552-2026-databand

Overview

The cs-552-2026-databand/general_knowledge_model is a specialized language model developed for the CS-552 Modern NLP Spring 2026 project. It is an SFT-only merged model, specifically fine-tuned for multiple-choice general knowledge questions. The model's primary function is to provide a single, concise, boxed answer (e.g., \boxed{A}) from a given set of choices (A through T).

Key Capabilities & Training

Specialized Answering: Designed to output only a boxed letter for multiple-choice questions, enforced by a custom chat template.
Supervised Fine-Tuning (SFT): Trained using LoRA SFT with a masked loss function, focusing on the final assistant boxed answer.
Diverse Training Data: Built from six general knowledge datasets, including Kaggle LLM Science, EduQG, EduAdapt, NCERT_MCQs, SciQ, and OpenBookQA, with balanced answer distributions.
Performance: Achieved 85.30% accuracy on its 2,000-example SFT validation set and 56.25% on MMLU Redux 2k, outperforming the baseline significantly.

Intended Use Cases

Automated Quiz/Test Answering: Ideal for systems requiring precise, single-choice answers to general knowledge questions.
Educational Tools: Can be integrated into platforms for evaluating understanding of factual information.
Knowledge Retrieval: Useful for applications where quick, definitive answers to multiple-choice queries are needed.

Limitations

The model is highly specialized for multiple-choice formats and may not perform optimally on open-ended or generative tasks.
A DPO experiment was conducted but ultimately not selected as it reduced external benchmark accuracy, indicating the SFT-only model is the most robust for its intended purpose.

Overview

Overview

Key Capabilities & Training

Intended Use Cases

Limitations

Full Model Card (README)