Model Overview
bluejude10/Smoothie-Qwen3-8B-KR-Self-Driving-Legal-v3 is an experimental 8B-parameter model fine-tuned with QLoRA on the dnotitia/Smoothie-Qwen3-8B base. Its primary goal was to investigate whether domain-specific fine-tuning could improve performance in a Retrieval-Augmented Generation (RAG) pipeline for Korean self-driving legal regulations.
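For concreteness, a QLoRA setup along these lines can be sketched with the `transformers` and `peft` libraries. Note that the card does not state the actual hyperparameters used; the rank, alpha, dropout, and target modules below are typical illustrative values, not the project's recipe.

```python
# Illustrative QLoRA sketch: 4-bit quantized base model plus LoRA adapters.
# Hyperparameters are assumptions for illustration, NOT taken from this card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "dnotitia/Smoothie-Qwen3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)   # only adapter weights are trained
```

Training would then proceed on the Korean legal Q&A dataset with a standard causal-LM objective; only the small LoRA adapter matrices are updated while the 4-bit base stays frozen.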
Key Findings & Capabilities
- Hypothesis Rejected: The core hypothesis, that a fine-tuned 8B model would match or outperform non-fine-tuned 14B models in legal interpretation, was rejected. In the RAG setup, fine-tuned models reached only 43-60% accuracy, versus 90% for non-fine-tuned 8B/14B models.
- RAG Context Bypass: Fine-tuning led the models to prioritize internally learned patterns over the provided RAG context, reducing the pipeline's effectiveness.
- Overfitting: The model overfit severely to superficial patterns in the training data, producing template-locked answers, concept confusion, and inconsistent yes/no (affirmation) behavior, especially on questions resembling the training set.
- Positive Discovery: Embedding the fine-tuning Q&A dataset directly into the RAG vector database, alongside the raw legal texts, significantly improved non-fine-tuned models, which reached 90% accuracy.
- CoT Fine-tuning: A subsequent Chain-of-Thought (CoT) fine-tuning attempt (v5) improved accuracy by 17 percentage points and uniquely answered some complex legal questions correctly, but introduced new failure modes such as generation loops and cross-contamination between learned examples.
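The "Positive Discovery" above, indexing the fine-tuning Q&A pairs as retrieval documents alongside the raw legal text, can be sketched with a toy in-memory retriever. Everything here (the bag-of-words similarity, the sample documents) is illustrative and not the project's actual pipeline, which would use a real embedding model and vector database:

```python
# Toy sketch: a RAG store that indexes BOTH raw legal text and Q&A pairs,
# so retrieval can surface a directly matching Q&A entry for a query.
# All documents and the similarity scheme are illustrative assumptions.
import math
from collections import Counter

def embed(text):
    """Stand-in for a real embedding model: term-frequency bag of words."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The store holds raw legal articles ("law") AND fine-tuning pairs ("qa").
corpus = [
    ("law", "article 2 an autonomous vehicle must yield to pedestrians at marked crossings"),
    ("law", "article 7 the safety driver must remain able to take over control at any time"),
    ("qa",  "q must a self-driving car yield at a pedestrian crossing a yes under article 2"),
]
index = [(kind, doc, embed(doc)) for kind, doc in corpus]

def retrieve(query, k=2):
    """Return the top-k documents by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(kind, doc) for kind, doc, _ in ranked[:k]]

hits = retrieve("must the car yield at a pedestrian crossing")
# The Q&A document ranks first because it nearly restates the query,
# while the raw article still appears as supporting context.
```

The design point is that a question-phrased document sits much closer to a user's question in embedding space than statute text does, which is a plausible mechanism for the 90% accuracy result with non-fine-tuned models.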
When to Use This Model
- Research & Experimentation: This model is explicitly for research and exploratory purposes, particularly for understanding the pitfalls of fine-tuning in RAG-based systems with limited, simple datasets.
- Not for Production: It is not recommended for critical applications such as legal advice, self-driving system decision-making, or any safety-critical use case, given its demonstrated lower accuracy and instability relative to non-fine-tuned alternatives.