Model Overview
declare-lab/trustalign_qwen2.5_7b is a 7.6-billion-parameter model fine-tuned from Qwen2.5 by declare-lab, distinguished by its Trust-Align training methodology. This approach enhances model trustworthiness by ensuring responses are strictly grounded in the provided documents. A key feature is its ability to refuse to answer when no supporting information can be found in the given context, thereby mitigating hallucination.
Key Capabilities
- Contextual Grounding: Provides answers exclusively based on pre-provided search results or documents.
- Citation Accuracy: Cites sources properly, using a bracketed format such as [1][2][3] when multiple documents support a claim.
- Refusal Mechanism: Explicitly states that it cannot answer when the information is not present in the provided documents.
- Unbiased Tone: Designed to generate responses with an unbiased and journalistic tone.
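The grounding and citation behavior above can be sketched in code. The prompt layout and helper names below are illustrative assumptions, not the model's documented chat template; the citation parser only relies on the bracketed `[1][2]` format described above.

```python
import re

def build_grounded_prompt(question, documents):
    """Assemble a prompt that numbers each document so the model can
    cite sources as [1], [2], ... (illustrative layout, not the
    model's official template)."""
    doc_block = "\n".join(
        f"Document [{i}]: {doc}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the search results below. "
        "Cite supporting documents with bracketed numbers, and say you "
        "cannot answer if the results do not contain the information.\n\n"
        f"{doc_block}\n\nQuestion: {question}\nAnswer:"
    )

def extract_citations(answer):
    """Collect the document numbers cited in a response,
    e.g. 'Paris is the capital [1][3].' -> [1, 3]."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
```

For example, `extract_citations("Paris is the capital of France [1][2].")` returns `[1, 2]`, which can then be checked against the documents actually passed to the model.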
Training Details
The model was trained using a combination of an instruction-following corpus with a standard next-token prediction objective and a preference dataset utilizing Direct Preference Optimization (DPO). The training data, declare-lab/trust_data, is publicly available on Hugging Face. The training was conducted on two NVIDIA A100 GPUs (40GB each).
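The DPO stage mentioned above optimizes the policy directly on preference pairs, without a separate reward model. A minimal sketch of the standard per-pair DPO loss follows; the inputs are summed log-probabilities of whole responses, and `beta=0.1` is an assumed default, since the model card does not state the actual training hyperparameters.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))),
    where w is the chosen response and l the rejected one."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    # Logistic loss on the log-ratio margin; positive margin lowers the loss.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference model, the margin is zero and the loss is log 2; as the policy prefers the chosen (grounded, properly refusing) response more strongly than the reference does, the loss decreases.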
Good For
- Applications requiring high factual accuracy and source attribution.
- Use cases where preventing hallucination and ungrounded responses is critical.
- Information retrieval systems that need to explicitly indicate when an answer cannot be found in the given context.