Abdullah-Taha/UTN-Qwen3-0.6B-LoRA-merged

Text generation · Concurrency cost: 1 · Model size: 0.8B · Quantization: BF16 · Context length: 32k · Published: Apr 7, 2026 · License: MIT · Architecture: Transformer · Open weights · Warm

The Abdullah-Taha/UTN-Qwen3-0.6B-LoRA-merged model is a 0.8 billion parameter language model based on the Qwen3 architecture, fine-tuned with LoRA on domain-specific data from the University of Technology Nuremberg (UTN). It is intended to answer questions about UTN, making it suitable for specialized information retrieval tasks, and because the adapter weights are merged it supports direct inference without PEFT libraries, streamlining deployment for UTN-specific applications.


Model Overview

Abdullah-Taha/UTN-Qwen3-0.6B-LoRA-merged is a specialized language model built on the Qwen3-0.6B base architecture. It was fine-tuned with LoRA (Low-Rank Adaptation, rank r=64, scaling alpha=128), and the adapter weights were subsequently merged into the base model, allowing direct inference without PEFT libraries. The model is tailored to the domain of the University of Technology Nuremberg (UTN).
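
Because the adapter is already merged, loading works like any stock Transformers checkpoint. A minimal sketch (the sample question and generation settings are illustrative, not from the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The LoRA weights are already folded into the checkpoint, so no
# peft import is required; this loads like any ordinary causal LM.
model_id = "Abdullah-Taha/UTN-Qwen3-0.6B-LoRA-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Qwen3 checkpoints ship a chat template; the question is a placeholder.
messages = [{"role": "user", "content": "What study programs does UTN offer?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```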

Key Capabilities

  • Domain-Specific Expertise: Fine-tuned on 1,289 UTN Q&A pairs, enabling it to provide relevant and accurate information regarding the University of Technology Nuremberg.
  • Efficient Inference: The LoRA weights are merged into the base model, simplifying deployment and allowing direct use with standard Hugging Face Transformers pipelines (see the sketch after this list).
  • Compact Size: With 0.8 billion parameters, it offers a balance between performance and computational efficiency, suitable for applications where resource constraints are a consideration.
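
As a sketch of the pipeline path mentioned above (the prompt wording and generation length are assumptions):

```python
from transformers import pipeline

# High-level pipeline API; works directly on the merged checkpoint
# with no adapter handling.
generator = pipeline(
    "text-generation",
    model="Abdullah-Taha/UTN-Qwen3-0.6B-LoRA-merged",
)
result = generator(
    "Question: Where is the University of Technology Nuremberg located?\nAnswer:",
    max_new_tokens=128,
)
print(result[0]["generated_text"])
```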

Training Details

The model was trained for 5 epochs with a learning rate of 3e-4 on an NVIDIA A40 GPU. Evaluation on a validation set of 17 examples showed a ROUGE-1 score of 0.5924, ROUGE-2 of 0.4967, and ROUGE-L of 0.5687, indicating strong lexical overlap with reference answers within its specialized domain.
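
For context, a hypothetical reconstruction of the adapter setup and merge step using the peft library; only r=64 and alpha=128 come from the model card, while the base checkpoint name and target modules are assumptions, and the training loop is omitted:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base checkpoint and target modules are assumptions for illustration.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B", torch_dtype=torch.bfloat16
)
lora_config = LoraConfig(
    r=64,            # adapter rank, as stated above
    lora_alpha=128,  # scaling factor, as stated above
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
model = get_peft_model(base, lora_config)

# ... fine-tune for 5 epochs at lr=3e-4 on the UTN Q&A pairs (omitted) ...

# Fold the adapter into the base weights so inference needs no PEFT.
merged = model.merge_and_unload()
merged.save_pretrained("UTN-Qwen3-0.6B-LoRA-merged")
```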

Good For

  • UTN-specific Q&A systems: Ideal for chatbots or virtual assistants designed to answer questions about the University of Technology Nuremberg.
  • Information retrieval: Can be used to extract or summarize information from UTN-related texts.
  • Specialized applications: Suitable for scenarios requiring a compact, domain-adapted language model focused on university-specific content.