thangvip/qwen3-4b-vietnamese-legal-grpo
thangvip/qwen3-4b-vietnamese-legal-grpo is a 4-billion-parameter Qwen 3 model fine-tuned by thangvip for Vietnamese legal reasoning. It was trained with Group Relative Policy Optimization (GRPO) to reason syllogistically and to produce structured legal arguments with citations. It targets tasks that need precise legal analysis in the Vietnamese context and returns structured answers to legal questions.
Vietnamese Legal Reasoning Model (GRPO)
Developed by thangvip, this 4-billion-parameter language model is based on Qwen 3 and fine-tuned specifically for Vietnamese legal reasoning. Training with Group Relative Policy Optimization (GRPO) targets syllogistic reasoning: each legal argument is structured as a major premise, a minor premise, and a conclusion.
Key Capabilities
- Syllogistic Reasoning: Generates structured legal arguments (Major Premise → Minor Premise → Conclusion).
- Vietnamese Legal Domain: Optimized for Vietnamese legal texts and question-answering.
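- GRPO Optimization: Trained with Group Relative Policy Optimization, a reinforcement learning method that scores each sampled response relative to the other responses generated for the same prompt.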
- Citation Support: Provides legal citations within its responses.
- Structured Output: Delivers responses in an XML-like format for clear organization.
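Because the responses are described as XML-like, downstream code will usually want to split a response into its syllogism parts before displaying or checking it. The following is a minimal sketch of such a parser. The tag names `major_premise`, `minor_premise`, and `conclusion` are assumptions made for illustration; this card does not list the actual tags, so adjust them to match real model output.

```python
import re
from typing import Dict, Optional

# Hypothetical tag names: the card only says the output is "XML-like",
# so these must be replaced with the tags the model actually emits.
SECTION_TAGS = ("major_premise", "minor_premise", "conclusion")


def parse_syllogism(response: str) -> Dict[str, Optional[str]]:
    """Extract each syllogism section; a missing section maps to None."""
    sections: Dict[str, Optional[str]] = {}
    for tag in SECTION_TAGS:
        # Non-greedy match so that only the first <tag>...</tag> pair is taken.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", response, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


def is_well_formed(sections: Dict[str, Optional[str]]) -> bool:
    """True only when every expected section is present and non-empty."""
    return all(text for text in sections.values())


if __name__ == "__main__":
    # Placeholder text, not real model output and not a real legal citation.
    sample = (
        "<major_premise>Rule R applies to any party P.</major_premise>\n"
        "<minor_premise>A is a party P.</minor_premise>\n"
        "<conclusion>Rule R applies to A.</conclusion>"
    )
    parsed = parse_syllogism(sample)
    print(parsed["conclusion"], is_well_formed(parsed))
```

A regular expression is used instead of a strict XML parser because generated text may not be valid XML; a response that fails `is_well_formed` can then be flagged for human review.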
Training Details
The model was fine-tuned with GRPO on Vietnamese legal question-answering data. The reward combines weighted components for correctness (35%), format compliance (20%), citation accuracy (15%), and reasoning quality (15%), together with penalties for hallucination and excessive length.
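To make the reward description concrete, here is a rough sketch of how a weighted reward and GRPO-style group-relative advantages could be computed. Only the four weights come from this card. The function names, the assumption that each component score lies in [0, 1], the way penalties are subtracted, and the mean/standard-deviation normalization are illustrative assumptions, not the author's training code.

```python
from statistics import mean, pstdev
from typing import Dict, List

# Weights as listed in the training details above.
WEIGHTS = {
    "correctness": 0.35,
    "format": 0.20,
    "citation": 0.15,
    "reasoning": 0.15,
}


def composite_reward(
    scores: Dict[str, float],
    hallucination_penalty: float = 0.0,  # hypothetical magnitude, not from the card
    length_penalty: float = 0.0,         # hypothetical magnitude, not from the card
) -> float:
    """Weighted sum of component scores (assumed in [0, 1]) minus penalties."""
    weighted = sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)
    return weighted - hallucination_penalty - length_penalty


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), the usual
    GRPO formulation; the exact variant used for this model is not stated.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Two hypothetical responses to the same legal question.
    good = composite_reward(
        {"correctness": 1.0, "format": 1.0, "citation": 1.0, "reasoning": 1.0}
    )
    bad = composite_reward(
        {"correctness": 0.0, "format": 1.0, "citation": 0.0, "reasoning": 0.5},
        hallucination_penalty=0.1,
    )
    print(group_relative_advantages([good, bad]))
```

Responses that score above their group's mean receive positive advantages and are reinforced, while those below the mean are discouraged, so no separate value model is needed.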
Good For
- Legal Education: Teaching legal reasoning methodologies.
- Legal Research: Assisting with preliminary analysis of legal questions.
- Document Drafting: Generating structured legal arguments.
- Legal Consultation: Providing initial legal guidance (requires human review).
Note: This model is domain-specific to Vietnamese law and should not replace professional legal advice; all conclusions and citations require verification.