comp4cls/comp4cls-4B

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Aug 26, 2025 | Architecture: Transformer | Warm

comp4cls/comp4cls-4B is a 4-billion-parameter model developed by Chanuk Lim for retrieval-augmented classification of scientific and technical documents. It uses entity-centric semantic compression to turn long documents into concise, task-focused representations, which lets a 4B-scale model match or outperform 8B–14B models in classification accuracy. The model is optimized for efficient, low-latency document classification in production pipelines, particularly for papers, patents, and R&D reports.

Comp4Cls: Semantic Compression for Enhanced Classification

Comp4Cls is a 4-billion-parameter model developed by Chanuk Lim for retrieval-augmented classification of scientific and technical documents, a setup that follows the retrieval-augmented generation (RAG) pattern but predicts labels rather than free-form answers. Its core innovation is entity-centric semantic compression, which converts lengthy texts into short, structured summaries while preserving the signals that discriminate between classes.

Key Capabilities & Features

  • Efficient Classification: A 4B-scale model that matches or surpasses the performance of 8B–14B models, especially on fine-grained categories, by operating on compressed texts.
  • Semantic Compression: Uses a two-stage prompting process (entity extraction → selective rewriting) to create concise summaries with an explicit compression ratio, reducing input tokens by ~50% on average.
  • RAG with Short Contexts: Operates on compressed texts for both queries and retrieved neighbors, mitigating "lost-in-the-middle" issues and allowing for broader top-k retrieval.
  • Scalability & Robustness: Evaluated on large, bilingual datasets including papers, patents, and R&D reports, demonstrating robustness across domains and hierarchical, multi-label taxonomies.
  • Production-Ready: Designed for low-latency and high-throughput deployment, with compressed outputs ready for standard vector databases and supporting downstream tasks like semantic search and TL;DR summarization.
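
The two-stage compression described above (entity extraction, then selective rewriting under an explicit ratio) could be wired up roughly as follows. This is a minimal sketch, not the author's implementation: the prompt templates, the `generate` callable (any function that sends a prompt to the model and returns text), and the whitespace-based `count_tokens` are all assumptions made for illustration, since the listing does not publish the actual prompts or tokenizer settings.

```python
from typing import Callable

# Hypothetical prompt templates; the real prompts are not given in this listing.
ENTITY_PROMPT = (
    "List the key entities (methods, materials, tasks, domains) in the "
    "document below, one per line.\n\nDocument:\n{document}"
)
REWRITE_PROMPT = (
    "Rewrite the document in at most {budget} words, keeping only content "
    "related to these entities:\n{entities}\n\nDocument:\n{document}"
)


def count_tokens(text: str) -> int:
    """Whitespace token count; a stand-in for the model's real tokenizer."""
    return len(text.split())


def compress(document: str, generate: Callable[[str], str], ratio: float = 0.5) -> str:
    """Two-stage compression: entity extraction, then selective rewriting."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    # Stage 1: ask the model which entities carry the discriminative signal.
    entities = generate(ENTITY_PROMPT.format(document=document))
    # Stage 2: rewrite around those entities under an explicit length budget.
    budget = max(1, int(count_tokens(document) * ratio))
    return generate(
        REWRITE_PROMPT.format(budget=budget, entities=entities, document=document)
    )


def compression_ratio(original: str, compressed: str) -> float:
    """Fraction of the original tokens that remain after compression."""
    return count_tokens(compressed) / max(1, count_tokens(original))
```

Passing `generate` in as a parameter keeps the sketch independent of any particular serving stack, and `compression_ratio` gives a simple way to check whether outputs land near the ~50% reduction reported above.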

Good For

  • Classifying scientific and technical documents (e.g., patents, research papers, R&D reports).
  • Reducing computational costs and latency in document classification pipelines.
  • Implementing efficient retrieval-augmented generation (RAG) systems where context length is a concern.
  • Building knowledge organization systems that require structured, compressed representations of documents.
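
For pipelines where context length is the constraint, retrieval over compressed texts might look like the sketch below: rank labelled neighbors by similarity to the compressed query, then pack as many as fit into a token budget before asking the model for a label. Everything here is illustrative and assumed rather than taken from the model card: the bag-of-words `embed` is a toy substitute for a dense embedding model and vector database, the prompt wording is invented, and token counts are approximated by whitespace splitting.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would use a dense embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, index: list[tuple[str, str]], k: int = 3) -> list[tuple[float, str, str]]:
    """Return the k most similar (score, text, label) entries from the index."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), text, label) for text, label in index]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]


def build_prompt(query: str, index: list[tuple[str, str]], k: int = 3,
                 max_tokens: int = 2000) -> str:
    """Assemble a classification prompt from labelled neighbors within a token budget."""
    header = "Classify the query document using the labelled examples.\n\n"
    footer = f"Query: {query}\nLabel:"
    used = len(header.split()) + len(footer.split())
    examples = []
    for _score, text, label in top_k(query, index, k):
        block = f"Example: {text}\nLabel: {label}\n\n"
        cost = len(block.split())
        if used + cost > max_tokens:
            break  # stop adding neighbors once the budget is exhausted
        examples.append(block)
        used += cost
    return header + "".join(examples) + footer
```

Because each neighbor costs roughly half as many tokens after compression, the same `max_tokens` budget admits a larger `k`, which is the practical effect behind the broader top-k retrieval mentioned earlier.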