jasong03/qwen3-1.7b-bilingual-amr-sft-v3
The jasong03/qwen3-1.7b-bilingual-amr-sft-v3 model is a 1.7 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B. This model has been specifically trained using Supervised Fine-Tuning (SFT) with the TRL framework. It is designed for general text generation tasks, leveraging its bilingual capabilities and a 32K context length. Its primary strength lies in generating coherent and contextually relevant responses based on user prompts.
Loading preview...
Model Overview
The jasong03/qwen3-1.7b-bilingual-amr-sft-v3 is a 1.7 billion parameter language model, building upon the base architecture of Qwen/Qwen3-1.7B. This model has undergone Supervised Fine-Tuning (SFT) using the TRL framework, indicating a focus on improving its ability to follow instructions and generate specific types of responses.
Key Capabilities
- Bilingual Text Generation: Inherits and potentially enhances the bilingual capabilities of its base model.
- Instruction Following: Fine-tuned with SFT, suggesting improved performance in generating responses aligned with given prompts.
- Context Handling: Features a substantial context length of 32,768 tokens, allowing for processing and generating longer, more complex texts.
Training Details
The model was trained using the TRL library (version 0.19.1) with Transformers (version 5.2.0), Pytorch (version 2.10.0), and Datasets (version 4.5.0). The training process was logged and can be visualized via Weights & Biases, providing transparency into its development.