anonymous-dada/DialFactSum-ACU-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 25, 2026 | Architecture: Transformer | Cold

DialFactSum-ACU-8B is an 8-billion-parameter dialogue summarization model developed by anonymous-dada, based on the Qwen3 architecture with a 32K context length. It is trained with ACU-driven Group Relative Policy Optimization (GRPO) to improve factual coverage and density in dialogue summarization. The model learns to lengthen its summaries where doing so captures more facts without sacrificing precision, and it outperforms standard SFT models and established baselines such as Ctrl-DiaSumm and MV-BART on factual metrics.


DialFactSum-ACU-8B: Advanced Dialogue Summarization

DialFactSum-ACU-8B is an 8-billion-parameter model developed by anonymous-dada and built specifically for dialogue summarization. It starts from a Qwen3 base and is fine-tuned with a novel ACU-driven Group Relative Policy Optimization (GRPO) framework, which improves factual coverage and density over traditional supervised fine-tuning (SFT).
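To make the "group relative" part of GRPO concrete: the method samples several candidate summaries for the same dialogue, scores each one, and then normalizes every reward against the mean and spread of its own group. The sketch below shows only that normalization step. The function name, the reward values, and the choice of population standard deviation are illustrative assumptions, not details taken from this model's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in its group.

    `rewards` holds one scalar reward per candidate summary generated for
    the same dialogue. Using the population standard deviation is an
    assumption; implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when all rewards in the group are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical ACU-based rewards for four summaries of one dialogue.
rewards = [0.40, 0.55, 0.60, 0.45]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f}  advantage={advantage:+.3f}")
```

Summaries that score above the group mean get a positive advantage and are reinforced, while those below it are discouraged. Because the baseline comes from the group itself, no separate value model is needed.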

Key Capabilities & Differentiators

  • Strategic Token Reallocation: Unlike standard SFT models that often over-truncate, DialFactSum-ACU-8B learns to expand summary length strategically, capturing more facts while maintaining high precision.
  • State-of-the-Art Factual Performance: It outperforms strong baselines such as Ctrl-DiaSumm and MV-BART on every factual metric of the RoSE benchmark (SAMSum subset), reaching an ACU F1 of 0.5685 and a Normalized ACU of 0.4635.
  • Mitigation of the "Truncation Trap": The GRPO policy addresses the SFT bottleneck in which models converge to conservative summary lengths. DialFactSum-ACU-8B generates more comprehensive summaries (approx. 30 words, versus approx. 18 words for SFT) without sacrificing quality.
  • Superior Factual Consistency: The bidirectional ACU reward function used in training reduces hallucinations and structural errors, improving factual consistency.
  • Preservation of Linguistic Quality: Evaluation via UniEval shows improvements in Coherence (0.9507) and Relevance (0.9041) compared to its SFT predecessor, avoiding the common "alignment tax" associated with reinforcement learning.
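
The interplay between summary length, precision, and ACU F1 described above can be illustrated with a toy calculation. The sketch below assumes that each summary has already been broken into a list of atomic facts and that a fact counts as matched when it appears verbatim in the reference list. That exact-match rule is a simplification chosen for illustration, and the function name and example facts are hypothetical. It is not the scoring procedure of the RoSE benchmark or the reward used to train this model.

```python
def acu_f1(summary_acus, reference_acus):
    """F1 over atomic facts, scored in both directions.

    Precision: share of summary facts that the reference supports.
    Recall: share of reference facts that the summary covers.
    Exact string matching is a stand-in for a real fact matcher.
    """
    summary_set, reference_set = set(summary_acus), set(reference_acus)
    if not summary_set or not reference_set:
        return 0.0
    matched = len(summary_set & reference_set)
    if matched == 0:
        return 0.0
    precision = matched / len(summary_set)
    recall = matched / len(reference_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical atomic facts for one dialogue.
reference = ["Ana booked a table", "the table is for Friday", "Ben will bring wine"]

# A short, conservative summary: fully precise but low coverage.
short_summary = ["Ana booked a table"]

# A longer summary: full coverage plus one unsupported fact.
long_summary = reference + ["Ben is always late"]

print(f"short: {acu_f1(short_summary, reference):.3f}")
print(f"long:  {acu_f1(long_summary, reference):.3f}")
```

In this toy case the longer summary scores higher despite one unsupported fact, because the gain in recall outweighs the loss in precision. A reward of this shape penalizes both over-truncation and padding with unsupported content.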

Good For

  • Applications requiring highly factual and dense summaries of dialogues.
  • Use cases where balancing summary length with factual accuracy is critical.
  • Researchers and developers looking for advanced dialogue summarization models that overcome limitations of traditional SFT.