transformers-community/dola
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Context Length: 32k · Published: Aug 21, 2025 · Architecture: Transformer
transformers-community/dola is an implementation of Decoding by Contrasting Layers (DoLa), a contrastive decoding strategy applied to the Qwen/Qwen3-0.6B base model. This 0.8-billion-parameter model, with a 32,768-token context length, improves factuality and reduces hallucinations by contrasting the logits of the final layer with those of earlier layers. Contrasting against higher layers works best for short-answer tasks, while contrasting against lower layers suits long-answer reasoning, making DoLa a useful option for improving output reliability in generative applications.
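Below is a minimal sketch of how DoLa decoding might be invoked, assuming the `dola_layers` argument of `model.generate` available in recent transformers releases and the Qwen/Qwen3-0.6B checkpoint named above; exact argument names and defaults may differ across versions.

```python
# Sketch: DoLa decoding with transformers (assumes the `dola_layers`
# generate() argument and the Qwen/Qwen3-0.6B base model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("What is the capital of Australia?", return_tensors="pt")

# "high" contrasts the final layer with higher layers (short-answer tasks);
# "low" contrasts it with lower layers (long-answer reasoning tasks).
outputs = model.generate(
    **inputs,
    do_sample=False,
    max_new_tokens=32,
    dola_layers="high",
    repetition_penalty=1.2,  # commonly recommended alongside DoLa to curb repetition
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Switching `dola_layers` to `"low"` (or to an explicit list of layer indices) shifts the contrast toward earlier layers, which the description above suggests for longer reasoning-style outputs.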