AmberYifan/Qwen2-7B-sft-ultrachat-safeRLHF

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32K · Architecture: Transformer

AmberYifan/Qwen2-7B-sft-ultrachat-safeRLHF is a 7.6-billion-parameter Qwen2-based language model fine-tuned by AmberYifan. It is a safety-aligned, instruction-tuned variant built on a prior supervised fine-tune (SFT) on UltraChat data, intended for general conversational AI applications where safety and adherence to instructions are priorities.


Model Overview

AmberYifan/Qwen2-7B-sft-ultrachat-safeRLHF is a 7.6-billion-parameter language model based on the Qwen2 architecture. It further fine-tunes AmberYifan/Qwen2-7B-sft-ultrachat, adding safety alignment through a process that, as the "safeRLHF" suffix suggests, likely involves Reinforcement Learning from Human Feedback (RLHF) on safety-preference data. The initial supervised fine-tuning (SFT) was performed on the UltraChat dataset.

Key Capabilities

  • Instruction Following: Designed to respond accurately and appropriately to user instructions.
  • Safety Alignment: Incorporates safety measures to reduce harmful or undesirable outputs.
  • Conversational AI: Suitable for general-purpose dialogue and question-answering tasks.
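Since this is a standard causal LM checkpoint, it can be loaded for inference with Hugging Face `transformers`. The sketch below is illustrative, not from the model card: the `build_prompt` fallback assumes the ChatML-style template commonly used by Qwen2 models, and the generation settings are placeholders.

```python
# Hypothetical inference sketch for this checkpoint; the exact chat
# template and generation settings are assumptions, not documented values.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AmberYifan/Qwen2-7B-sft-ultrachat-safeRLHF"

def build_prompt(messages):
    """Fallback ChatML-style prompt (the format Qwen2 models typically use).

    `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to reply
    return "\n".join(parts)

def generate(messages, max_new_tokens=256):
    """Load the model and generate a reply to the conversation so far."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    prompt = build_prompt(messages)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

If the repository ships a chat template, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` is preferable to the manual `build_prompt` fallback.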

Training Details

This model was trained with the TRL (Transformer Reinforcement Learning) framework, version 0.12.2, with Supervised Fine-Tuning (SFT) as the foundational step. The reported framework versions are Transformers 4.46.3 and PyTorch 2.5.1+cu118.
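The exact training configuration is not published; as a rough orientation, an SFT stage like the one described above is typically run with TRL's `SFTTrainer`. Everything below is an assumption: the dataset split, base model ID, and all hyperparameters are placeholders, not values from this model's training run.

```python
# Hypothetical SFT configuration sketch using the TRL 0.12.x API.
# Dataset, base model, and hyperparameters are illustrative guesses.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# UltraChat data; the actual dataset/split used for this model is unknown.
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

config = SFTConfig(
    output_dir="qwen2-7b-sft-ultrachat",
    max_seq_length=2048,            # assumed
    per_device_train_batch_size=2,  # assumed
    learning_rate=2e-5,             # assumed
)

trainer = SFTTrainer(
    model="Qwen/Qwen2-7B",          # assumed base checkpoint
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

The safety-alignment ("safeRLHF") stage would follow this SFT step; its method and data are not documented here, so no sketch is attempted for it.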

Intended Use Cases

  • Developing chatbots requiring safe and instruction-tuned responses.
  • Applications where a balance of general knowledge and safety is crucial.
  • Prototyping conversational agents with a focus on controlled output generation.