Model Overview
jasong03/qwen3-1.7b-bilingual-amr-sft-v2 is a 1.7-billion-parameter language model fine-tuned from the Qwen/Qwen3-1.7B base model. It was produced with Supervised Fine-Tuning (SFT) using the TRL framework, specifically targeting bilingual Abstract Meaning Representation (AMR) tasks.
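The checkpoint loads with the standard transformers API. A minimal loading sketch (the printed values assume the published config matches the base Qwen3-1.7B):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jasong03/qwen3-1.7b-bilingual-amr-sft-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

print(model.num_parameters())                 # ~1.7B parameters
print(model.config.max_position_embeddings)   # 32768-token context window
```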
Key Capabilities
- Bilingual AMR Processing: Specialized in understanding and generating Abstract Meaning Representations in a bilingual context (see the usage sketch after this list).
- Qwen3-1.7B Foundation: Benefits from the robust architecture of the Qwen3-1.7B base model.
- Supervised Fine-Tuning: Adapted to its target domain through SFT rather than general-purpose instruction tuning, which should improve performance on bilingual AMR tasks.
- Extended Context Length: Supports a context length of 32768 tokens, allowing for the processing of longer and more complex inputs.
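A minimal inference sketch for AMR parsing. The exact prompt format this checkpoint expects is not documented here, so the instruction wording below is an assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jasong03/qwen3-1.7b-bilingual-amr-sft-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical prompt; adjust to whatever format the model was trained on.
messages = [{"role": "user",
             "content": "Parse the following sentence into AMR:\nThe boy wants to go."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For reference, the classic AMR for this example sentence is written roughly as `(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))`.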
Training Details
The model was trained using SFT, with the run monitored via Weights & Biases. Training used the following framework versions (a reproduction sketch follows the list):
- TRL: 0.19.1
- Transformers: 5.2.0
- PyTorch: 2.10.0
- Datasets: 4.5.0
- Tokenizers: 0.22.2
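A minimal sketch of how such an SFT run could be set up with TRL. The dataset file, data format, and hyperparameters below are illustrative assumptions, not the actual training configuration:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical data file: the actual bilingual AMR corpus is not named in this card.
dataset = load_dataset("json", data_files="bilingual_amr_sft.jsonl", split="train")

config = SFTConfig(
    output_dir="qwen3-1.7b-bilingual-amr-sft-v2",
    max_length=32768,                  # matches the model's extended context window
    per_device_train_batch_size=1,     # assumed; not stated in the card
    gradient_accumulation_steps=8,     # assumed; not stated in the card
    report_to="wandb",                 # the card notes monitoring via Weights & Biases
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",           # base model named in this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```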
When to Use This Model
This model is particularly well-suited for research and applications requiring:
- Analysis or generation of Abstract Meaning Representations.
- Tasks involving bilingual text processing.
- Scenarios where a compact yet capable model (1.7B parameters) with a long context window is beneficial for specialized linguistic tasks.