sebsigma/SemanticCite-Refiner-Qwen3-1B
The SemanticCite-Refiner-Qwen3-1B is a 2 billion parameter causal language model developed by Sebastian Haan, fine-tuned from Qwen3-1.7B. This model specializes in preprocessing citation text by removing reference markers, author names, and publication identifiers. It converts author-centered statements to fact-centered statements, making it ideal for improving citation verification and standardizing academic text. With a context length of 40960 tokens, it focuses on cleaning and preparing citations for downstream analysis.
Loading preview...
Overview
The sebsigma/SemanticCite-Refiner-Qwen3-1B is a specialized 2 billion parameter causal language model, fine-tuned by Sebastian Haan from unsloth/Qwen3-1.7B-unsloth-bnb-4bit. Its primary function is to preprocess and refine citation text, making it suitable for academic verification systems.
Key Capabilities
- Citation Cleaning: Removes various reference markers such as bracketed numbers (e.g.,
[1]), author-year citations (e.g.,Smith 2020), andet al.notations. - Statement Transformation: Converts author-centered statements (e.g., "Smith (2020) found that...") into fact-centered, passive voice statements (e.g., "It was found that..."), while preserving all numerical values and factual details.
- Standardization: Aims to standardize citation statements for improved consistency and downstream processing.
Good For
- Academic Verification: Ideal as a first-stage component in pipelines designed to verify academic citations.
- Text Preprocessing: Useful for cleaning and preparing citation-rich documents for further analysis.
- Fact-Centric Conversion: Specifically designed for transforming text to focus on facts rather than attribution style.
Limitations
This model is not intended for general text summarization, legal or medical document processing, or creative content generation.