Model Overview
bluejude10/Smoothie-Qwen3-8B-KR-Self-Driving-Legal-v3 is an experimental 8B-parameter model fine-tuned with QLoRA on the dnotitia/Smoothie-Qwen3-8B base. Its primary goal was to investigate whether domain-specific fine-tuning could improve performance in a Retrieval-Augmented Generation (RAG) pipeline for Korean self-driving legal regulations.
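For concreteness, a QLoRA setup along these lines can be sketched with the `transformers` and `peft` libraries. Note that the card does not state the actual hyperparameters used; the rank, alpha, dropout, and target modules below are typical illustrative values, not the project's recipe.

```python
# Illustrative QLoRA sketch: 4-bit quantized base model plus LoRA adapters.
# Hyperparameters are assumptions for illustration, NOT taken from this card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "dnotitia/Smoothie-Qwen3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)   # only adapter weights are trained
```

Training would then proceed on the Korean legal Q&A dataset with a standard causal-LM objective; only the small LoRA adapter matrices are updated while the 4-bit base stays frozen.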
Key Findings & Capabilities
- Hypothesis Rejected: The core hypothesis, that a fine-tuned 8B model would match or outperform non-fine-tuned 14B models in legal interpretation, was rejected. In the RAG setup, fine-tuned models reached only 43-60% accuracy, versus 90% for non-fine-tuned 8B/14B models.
- RAG Context Bypass: Fine-tuning led the models to prioritize internally learned patterns over the provided RAG context, reducing the pipeline's effectiveness.
- Overfitting: The model overfit severely to superficial patterns in the training data, producing template-locked answers, concept confusion, and inconsistent yes/no (affirmation) behavior, especially on questions resembling the training set.
- Positive Discovery: Embedding the fine-tuning Q&A dataset directly into the RAG vector database, alongside the raw legal texts, significantly improved non-fine-tuned models, which reached 90% accuracy.
- CoT Fine-tuning: A subsequent Chain-of-Thought (CoT) fine-tuning attempt (v5) improved accuracy by 17 percentage points and uniquely answered some complex legal questions correctly, but introduced new failure modes such as generation loops and cross-contamination between learned examples.
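The "Positive Discovery" above, indexing the fine-tuning Q&A pairs as retrieval documents alongside the raw legal text, can be sketched with a toy in-memory retriever. Everything here (the bag-of-words similarity, the sample documents) is illustrative and not the project's actual pipeline, which would use a real embedding model and vector database:

```python
# Toy sketch: a RAG store that indexes BOTH raw legal text and Q&A pairs,
# so retrieval can surface a directly matching Q&A entry for a query.
# All documents and the similarity scheme are illustrative assumptions.
import math
from collections import Counter

def embed(text):
    """Stand-in for a real embedding model: term-frequency bag of words."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The store holds raw legal articles ("law") AND fine-tuning pairs ("qa").
corpus = [
    ("law", "article 2 an autonomous vehicle must yield to pedestrians at marked crossings"),
    ("law", "article 7 the safety driver must remain able to take over control at any time"),
    ("qa",  "q must a self-driving car yield at a pedestrian crossing a yes under article 2"),
]
index = [(kind, doc, embed(doc)) for kind, doc in corpus]

def retrieve(query, k=2):
    """Return the top-k documents by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(kind, doc) for kind, doc, _ in ranked[:k]]

hits = retrieve("must the car yield at a pedestrian crossing")
# The Q&A document ranks first because it nearly restates the query,
# while the raw article still appears as supporting context.
```

The design point is that a question-phrased document sits much closer to a user's question in embedding space than statute text does, which is a plausible mechanism for the 90% accuracy result with non-fine-tuned models.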
When to Use This Model
- Research & Experimentation: This model is explicitly for research and exploratory purposes, particularly for understanding the pitfalls of fine-tuning in RAG-based systems with limited, simple datasets.
- Not for Production: It is not recommended for critical applications such as legal advice, self-driving system decision-making, or any safety-critical use case, given its demonstrated lower accuracy and instability relative to non-fine-tuned alternatives.