LiberteEPFL/qwen3-1.7b-sft-bigchat-v2
LiberteEPFL/qwen3-1.7b-sft-bigchat-v2 is a 1.7 billion parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base by LiberteEPFL. This model has been trained using SFT (Supervised Fine-Tuning) with TRL, making it suitable for conversational AI and general text generation tasks. It supports a context length of 32768 tokens, offering robust performance for applications requiring extensive context understanding.
Loading preview...
Overview
LiberteEPFL/qwen3-1.7b-sft-bigchat-v2 is a 1.7 billion parameter language model, fine-tuned from the Qwen/Qwen3-1.7B-Base architecture. Developed by LiberteEPFL, this model leverages Supervised Fine-Tuning (SFT) using the TRL library to enhance its conversational capabilities.
Key Capabilities
- General Text Generation: Capable of generating coherent and contextually relevant text based on prompts.
- Conversational AI: Optimized through SFT for interactive dialogue and chat-based applications.
- Extended Context Understanding: Supports a substantial context length of 32768 tokens, allowing for processing and generating longer sequences of text.
Training Details
The model was trained using the TRL framework (version 1.3.0) with Transformers (4.57.0), Pytorch (2.8.0+cu128), Datasets (4.8.5), and Tokenizers (0.22.1). The training procedure involved Supervised Fine-Tuning, building upon the base Qwen3-1.7B model to specialize its responses.
Good For
- Developing chatbots and virtual assistants.
- Applications requiring text completion or generation with a focus on dialogue.
- Scenarios where a balance between model size and context handling is crucial.