Overview
Model Overview
heizige/Qwen2.5-Social-3B-NB-Chat is a 3.1 billion parameter model, fine-tuned from Qwen2.5-3B-Instruct using LoRA (Low-Rank Adaptation) to specialize in social dialogue. The primary goal of this model is to overcome the "overly official" and "lacking human touch" responses often seen in general large language models, particularly in specific social contexts.
Key Capabilities & Features
- Specialized Social Dialogue: Optimized for high-emotional-intelligence conversations across five core social scenarios: Elder, Girlfriend, Mentor, Stranger, and Spouse.
- Persona Maintenance: Demonstrates strong ability to maintain specific character personas, including humor and context-appropriate responses.
- Instruction Tuning: Fine-tuned on a custom dataset of 2843 dialogues, with system prompts used to enforce specific personas and conversational styles.
- Efficient Training: Trained using FP16 precision and gradient checkpointing, allowing completion on 8GB GPU VRAM.
Performance Highlights
- Improved Social Interaction: Achieves significantly higher scores in human and AI-based evaluations compared to the base Qwen2.5-3B-Instruct model and other smaller models like deepseek-r1:1.5b and codegemma:5b in social dialogue tasks.
- Context Length: Supports a context length of 32768 tokens.
Ideal Use Cases
- Role-playing applications: Generating dynamic and persona-consistent dialogue for interactive experiences.
- Social chatbots: Creating more engaging and human-like conversational agents for specific social interactions.
- Content generation: Crafting nuanced and emotionally intelligent responses for various social communication needs.