jasong03/qwen3-1.7b-bilingual-amr-sft-v2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Feb 20, 2026Architecture:Transformer Warm

The jasong03/qwen3-1.7b-bilingual-amr-sft-v2 model is a 2 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B using Supervised Fine-Tuning (SFT) with the TRL framework. This model is designed for bilingual applications, focusing on Abstract Meaning Representation (AMR) tasks. It leverages a 32768 token context length, making it suitable for processing longer sequences in its specialized domain.

Loading preview...

Model Overview

The jasong03/qwen3-1.7b-bilingual-amr-sft-v2 is a 2 billion parameter language model, fine-tuned from the base Qwen/Qwen3-1.7B architecture. This model has undergone Supervised Fine-Tuning (SFT) using the TRL framework, specifically targeting bilingual Abstract Meaning Representation (AMR) tasks.

Key Capabilities

  • Bilingual AMR Processing: Specialized in understanding and generating Abstract Meaning Representations in a bilingual context.
  • Qwen3-1.7B Foundation: Benefits from the robust architecture of the Qwen3-1.7B base model.
  • Supervised Fine-Tuning: Enhanced for specific tasks through SFT, indicating improved performance on its target domain.
  • Extended Context Length: Supports a context length of 32768 tokens, allowing for the processing of longer and more complex inputs.

Training Details

The model was trained using SFT, with the process monitored via Weights & Biases. The training utilized specific versions of key frameworks:

  • TRL: 0.19.1
  • Transformers: 5.2.0
  • Pytorch: 2.10.0
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

When to Use This Model

This model is particularly well-suited for research and applications requiring:

  • Analysis or generation of Abstract Meaning Representations.
  • Tasks involving bilingual text processing.
  • Scenarios where a compact yet capable model (2B parameters) with a long context window is beneficial for specialized linguistic tasks.