g34634/qwen2.5-3b-memory-summary-v1
g34634/qwen2.5-3b-memory-summary-v1 is a Qwen2.5-3B-Instruct model (roughly 3.1 billion parameters) fine-tuned to extract structured memory states from multi-turn conversations. It generates a JSON object containing key facts, unresolved references, the topic, a turn count, and a one-sentence conversation summary. The model is intended to preprocess dialogues for downstream components such as routers, RAG systems, and other LLMs in conversational AI pipelines.
Overview
This model, g34634/qwen2.5-3b-memory-summary-v1, is a fine-tuned version of the Qwen2.5-3B-Instruct base model. Its primary function is to act as a Memory State Generator within a multi-turn dialogue system. It processes conversational input and outputs a structured JSON object containing a `memory_state` and a `memory_summary`.
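A minimal usage sketch with the `transformers` library is shown below. It assumes the model follows the standard Qwen2.5 chat template and that the dialogue is passed as a single user message; the exact prompt or instruction format used during fine-tuning is not documented here, so treat the prompt construction as an assumption. The example dialogue is illustrative only.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "g34634/qwen2.5-3b-memory-summary-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical multi-turn dialogue to summarize into a memory state.
dialogue = (
    "User: I'm planning a trip to Kyoto in April.\n"
    "Assistant: Great choice, that's cherry blossom season.\n"
    "User: Can you remind me what I said about my budget?"
)

# Assumption: the dialogue is supplied as the user turn of a chat prompt.
messages = [{"role": "user", "content": dialogue}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens.
generated = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)

# The model is expected to emit a JSON object; parsing may fail if the
# output contains extra text, so guard accordingly in production.
memory = json.loads(generated)
print(memory["memory_summary"])
```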
Key Capabilities
- Structured Memory Extraction: Generates a JSON output with `key_facts`, `unresolved_refs`, `topic`, and `turn_count` from ongoing dialogues (see the sketch after this list).
- Conversation Summarization: Provides a concise, one-sentence `memory_summary` of the conversation so far.
- Pipeline Preprocessing: Designed to run early in a dialogue pipeline, providing structured context for subsequent components like routers, RAG systems, or other LLMs.
- Fine-tuned Performance: Trained using SFT + LoRA on a combination of DialogSum and QMSum datasets, achieving a final validation loss of 0.693.
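The shape below illustrates the fields named above. The field values are hypothetical, and nesting the four extraction fields under `memory_state` is an assumption based on the description in the Overview rather than a documented schema.

```python
# Illustrative output shape only; not actual model output.
example_output = {
    "memory_state": {
        "key_facts": ["User is planning a trip to Kyoto in April."],
        "unresolved_refs": ["the budget the user mentioned earlier"],
        "topic": "travel planning",
        "turn_count": 3,
    },
    "memory_summary": "The user is planning an April trip to Kyoto and asked to be reminded of their budget.",
}
```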
Use Cases & Limitations
Good for:
- Enhancing multi-turn dialogue systems by providing structured context.
- Improving routing and retrieval accuracy in conversational AI.
- Summarizing ongoing conversations for quick understanding.
Limitations:
- `turn_count` extraction may be inaccurate depending on the dialogue format.
- `key_facts` can sometimes be more abstract summaries than concrete facts.
- Optimized for short to medium-length conversations, with a maximum sequence length of 512 tokens (see the truncation sketch below).
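Given the 512-token limit, long dialogues can be truncated before encoding. A small sketch follows, reusing the `tokenizer` and `prompt` from the usage example above; it assumes the limit applies to the full prompt, and left truncation is chosen here so the most recent turns are kept.

```python
# Keep the most recent turns when the prompt exceeds the 512-token limit.
tokenizer.truncation_side = "left"
inputs = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=512
).to(model.device)
```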