jasong03/qwen3-1.7b-bilingual-amr-sft-v1
The jasong03/qwen3-1.7b-bilingual-amr-sft-v1 model is a 1.7 billion parameter language model, fine-tuned from the Qwen3-1.7B architecture. This model is specifically fine-tuned for bilingual AMR (Abstract Meaning Representation) tasks, indicating its specialization in semantic parsing for multiple languages. It is designed for applications requiring advanced natural language understanding and semantic representation across different linguistic contexts. Its 32768 token context length supports processing longer inputs for complex semantic analysis.
Loading preview...
Model Overview
The jasong03/qwen3-1.7b-bilingual-amr-sft-v1 is a 1.7 billion parameter language model, fine-tuned from the base Qwen/Qwen3-1.7B architecture. This model is specifically optimized for bilingual Abstract Meaning Representation (AMR) tasks through supervised fine-tuning (SFT).
Key Capabilities
- Bilingual AMR Parsing: Specialized in converting natural language sentences into Abstract Meaning Representations across multiple languages.
- Semantic Understanding: Designed to capture the semantic structure of sentences, making it suitable for advanced natural language understanding applications.
- Qwen3-1.7B Base: Leverages the foundational capabilities of the Qwen3-1.7B model, including its 32768 token context window.
Training Details
The model was trained with a learning rate of 2e-05, a batch size of 4, and a gradient accumulation of 4, resulting in an effective total batch size of 16. It utilized the AdamW_Torch_Fused optimizer with a cosine learning rate scheduler over 3 epochs. The training was conducted using Transformers 5.2.0, Pytorch 2.10.0+cu128, Datasets 4.5.0, and Tokenizers 0.22.2.
Intended Use Cases
This model is particularly well-suited for research and development in:
- Cross-lingual semantic parsing.
- Applications requiring deep semantic understanding of text in multiple languages.
- Tasks involving knowledge representation and extraction from natural language.