cs-552-2026-catma/general_knowledge_model

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:May 11, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The cs-552-2026-catma/general_knowledge_model is a 2 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B for the CS-552 Modern NLP course. This model specializes in closed-book multiple-choice factual and reasoning questions, optimized to provide concise answers in a LaTeX \boxed{} format. It demonstrates an ability to extract answers with high reliability and is designed for general knowledge benchmarks.

Loading preview...

Model Overview

The cs-552-2026-catma/general_knowledge_model is a 2 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B for the CS-552 Modern NLP course project. Its primary focus is on General Knowledge tasks, specifically answering closed-book multiple-choice factual and reasoning questions.

Key Capabilities

  • Closed-book factual question answering: Excels at retrieving factual information without external context.
  • Multiple-choice reasoning: Designed to process and answer questions presented in a multiple-choice format.
  • Structured answer extraction: Optimized to output final answers as a single option letter within a LaTeX \boxed{} expression, facilitating automated evaluation.
  • Concise responses: Configured with a non-thinking mode chat template to encourage brief and direct answers.

Training and Performance

The model was trained using Supervised Fine-Tuning (SFT) with LoRA on a processed General Knowledge dataset derived from MMLU-style examples. During local validation on a small sanity-check set, it achieved a 10/10 extraction rate and 6/10 accuracy. This model serves as an intermediate SFT baseline to establish a working pipeline for general knowledge tasks and verify extractable boxed answers, with performance potentially varying on broader or more complex factual reasoning challenges.

Good For

  • Applications requiring precise, extractable answers to multiple-choice factual questions.
  • Benchmarking general knowledge understanding in a closed-book setting.
  • Use cases where concise, structured output is preferred for automated processing.