comp4cls/comp4cls-4B
comp4cls/comp4cls-4B is a 4-billion-parameter model developed by Chanuk Lim for retrieval-augmented classification of scientific and technical documents. It uses entity-centric semantic compression to turn long documents into concise, task-focused representations, which lets a 4B-scale model match or outperform 8B–14B models in classification accuracy. The model is optimized for efficient, low-latency document classification in production pipelines, particularly for papers, patents, and R&D reports.
Comp4Cls: Semantic Compression for Enhanced Classification
Comp4Cls is a 4-billion-parameter model developed by Chanuk Lim for retrieval-augmented classification of scientific and technical documents. Its core innovation is entity-centric semantic compression, which converts lengthy texts into short, structured summaries while preserving the signals that discriminate between classes.
Key Capabilities & Features
- Efficient Classification: A 4B-scale model that matches or surpasses the performance of 8B–14B models, especially in fine-grained categories, by operating on compressed texts.
- Semantic Compression: Uses a two-stage prompting process (entity extraction → selective rewriting) to create concise summaries with an explicit compression ratio, reducing input tokens by ~50% on average.
- RAG with Short Contexts: Operates on compressed texts for both queries and retrieved neighbors, mitigating "lost-in-the-middle" issues and allowing for broader top-k retrieval.
- Scalability & Robustness: Evaluated on large, bilingual datasets including papers, patents, and R&D reports, demonstrating robustness across domains and hierarchical, multi-label taxonomies.
- Production-Ready: Designed for low-latency and high-throughput deployment, with compressed outputs ready for standard vector databases and supporting downstream tasks like semantic search and TL;DR summarization.
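The two-stage compression described in the list above (entity extraction, then selective rewriting toward an explicit compression ratio) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the model's actual prompts or API: the prompt wording, the `Compressed` container, the `compress` function, and the `generate` callable (any function that maps a prompt string to generated text) are all hypothetical names introduced here. The ratio is measured in whitespace-separated words for simplicity, whereas the model card speaks of tokens.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Compressed:
    entities: List[str]
    summary: str
    achieved_ratio: float  # summary words / document words


def entity_prompt(document: str) -> str:
    # Stage 1 prompt (hypothetical wording): ask for key entities.
    return (
        "Extract the key technical entities from the document below "
        "as a comma-separated list.\n\n" + document
    )


def rewrite_prompt(document: str, entities: List[str], target_words: int) -> str:
    # Stage 2 prompt (hypothetical wording): rewrite around the entities
    # under an explicit length budget.
    return (
        f"Rewrite the document below in at most {target_words} words, "
        f"keeping only content about these entities: {', '.join(entities)}.\n\n"
        + document
    )


def compress(
    document: str,
    generate: Callable[[str], str],  # assumed: prompt in, generated text out
    ratio: float = 0.5,
) -> Compressed:
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    doc_words = len(document.split())

    # Stage 1: entity extraction.
    raw = generate(entity_prompt(document))
    entities = [e.strip() for e in raw.split(",") if e.strip()]

    # Stage 2: selective rewriting toward the requested compression ratio.
    target_words = max(1, int(doc_words * ratio))
    summary = generate(rewrite_prompt(document, entities, target_words)).strip()

    achieved = len(summary.split()) / max(1, doc_words)
    return Compressed(entities, summary, achieved)
```

Passing `generate` in as a plain callable keeps the sketch independent of any particular inference library; in practice it would wrap whatever client serves the model.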
Good For
- Classifying scientific and technical documents (e.g., patents, research papers, R&D reports).
- Reducing computational costs and latency in document classification pipelines.
- Implementing efficient retrieval-augmented generation (RAG) systems where context length is a concern.
- Building knowledge organization systems that require structured, compressed representations of documents.
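To illustrate the short-context retrieval setup mentioned above, the following sketch retrieves the top-k labeled neighbors by cosine similarity and assembles a classification prompt from compressed texts only. It is an assumption-laden illustration, not the author's pipeline: the `Neighbor` tuple layout, the functions `cosine`, `top_k`, and `build_prompt`, and the prompt wording are all hypothetical, and the embeddings are assumed to come from some external encoder that is not shown.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout: (compressed_text, label, embedding)
Neighbor = Tuple[str, str, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Define similarity with a zero vector as 0.0 to avoid division by zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: Sequence[float], index: List[Neighbor], k: int) -> List[Neighbor]:
    # Brute-force ranking; a vector database would replace this at scale.
    ranked = sorted(index, key=lambda n: cosine(query_vec, n[2]), reverse=True)
    return ranked[:k]


def build_prompt(query_summary: str, neighbors: List[Neighbor]) -> str:
    # Both the neighbors and the query are compressed summaries, so a larger
    # k fits in the same context budget than with full documents.
    parts = ["Classify the final document using the labeled examples."]
    for text, label, _ in neighbors:
        parts.append(f"Document: {text}\nLabel: {label}")
    parts.append(f"Document: {query_summary}\nLabel:")
    return "\n\n".join(parts)
```

Because every example in the prompt is a compressed summary, roughly halving the tokens per document leaves room for about twice as many neighbors in the same context window.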