# VLSP2025-LegalSML/qwen3-1.7b-legal-pretrain: Vietnamese Legal Domain Model
This model is a specialized 1.7 billion parameter language model, continually pretrained from the Qwen3-1.7B architecture by the VLSP 2025 LegalSLM Task Organizers. Its primary focus is on Vietnamese legal text understanding and legal question answering.
## Key Capabilities & Training
- Domain Specialization: Adapted specifically for the Vietnamese legal domain through extensive continual pretraining.
- Training Data: Utilizes a curated corpus of approximately 144,000 Vietnamese legal texts, comprising:
  - ~96,000 official legal documents (laws, decrees, circulars).
  - ~48,000 legal news articles and commentary.
- Base Architecture: Built upon Qwen/Qwen3-1.7B.
- Context Length: Supports a maximum sequence length of 4096 during training, with a stated context length of 32,768 tokens.
- Training Method: Employed full-parameter fine-tuning for continual pretraining, without quantization or LoRA.
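Continual pretraining at a fixed 4,096-token window typically means that long legal documents are split, and short ones concatenated, into uniform training sequences. The sketch below illustrates one common packing scheme under stated assumptions: the organizers' actual data pipeline is not published, and the `eos_id` separator and drop-last-chunk behavior are conventional choices, not documented ones.

```python
from typing import Iterator, List

MAX_SEQ_LEN = 4096  # training sequence length stated in this model card

def pack_sequences(token_streams: List[List[int]],
                   max_len: int = MAX_SEQ_LEN,
                   eos_id: int = 0) -> Iterator[List[int]]:
    """Concatenate tokenized documents and emit fixed-length chunks.

    Documents are separated by an EOS token; the final partial chunk
    is dropped, a common simplification in pretraining pipelines.
    """
    buffer: List[int] = []
    for tokens in token_streams:
        buffer.extend(tokens)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= max_len:
            yield buffer[:max_len]
            buffer = buffer[max_len:]

# Toy example: three "documents" of placeholder token ids.
docs = [[1] * 5000, [2] * 3000, [3] * 6000]
chunks = list(pack_sequences(docs))
```

With 14,003 total tokens (including separators), this yields three full 4,096-token chunks and discards the remainder.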
## Intended Use & Limitations
- Good for: Developers and researchers working on legal AI applications in Vietnamese, particularly for tasks requiring deep understanding of legal documents or answering legal queries.
- License: Released for research purposes only within the scope of the VLSP 2025 Evaluation Campaign. Usage outside the campaign must adhere to the relevant licensing agreements.
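Because this is a continually pretrained base checkpoint rather than an instruction-tuned one, downstream legal question answering is typically framed as text completion. Below is a minimal sketch of a completion-style prompt builder; the field labels and wording are a hypothetical convention for illustration, not a format defined by the organizers.

```python
def build_legal_qa_prompt(question: str, context: str = "") -> str:
    """Frame a Vietnamese legal question as a completion prompt.

    The labels below ("Văn bản pháp luật", "Câu hỏi", "Trả lời" -
    legal text, question, answer) are an illustrative convention,
    not one specified by this model card.
    """
    parts = []
    if context:
        parts.append(f"Văn bản pháp luật:\n{context}\n")
    parts.append(f"Câu hỏi: {question}")
    parts.append("Trả lời:")  # the model completes the answer here
    return "\n".join(parts)

prompt = build_legal_qa_prompt(
    "Thời hiệu khởi kiện tranh chấp hợp đồng là bao lâu?",
    context="Điều 429 Bộ luật Dân sự 2015 ...",
)
```

Ending the prompt with an answer cue lets a base language model continue naturally instead of requiring a chat template.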
This model offers a robust foundation for developing legal-specific NLP solutions within the Vietnamese context, leveraging a substantial and relevant dataset.