W-61/mistral-7b-base-sft-hh-helpful-4xh200-batch-64

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Apr 18, 2026 · Architecture: Transformer

W-61/mistral-7b-base-sft-hh-helpful-4xh200-batch-64 is a 7-billion-parameter language model fine-tuned from mistralai/Mistral-7B-v0.3 on the Anthropic/hh-rlhf dataset to improve helpfulness and alignment with human preferences. With a context length of 4096 tokens, it is suited to conversational AI and instruction-following tasks where helpful and harmless responses are critical.


Model Overview

This model, W-61/mistral-7b-base-sft-hh-helpful-4xh200-batch-64, is a 7-billion-parameter large language model, fine-tuned from the mistralai/Mistral-7B-v0.3 base model.

Key Capabilities

  • Fine-tuned for Helpfulness: The model has undergone supervised fine-tuning (SFT) using the Anthropic/hh-rlhf dataset. This training focuses on generating responses that are helpful and aligned with human preferences.
  • Base Architecture: Leverages the efficient and performant Mistral-7B architecture, known for its strong performance relative to its size.
  • Context Window: Supports a context length of 4096 tokens, suitable for processing moderately long inputs and generating coherent, extended responses (a loading sketch follows this list).
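
Because the model is a standard Mistral-7B fine-tune, it should load with the Hugging Face transformers library like any other causal language model. The snippet below is a minimal sketch, not an official usage example: the repository ID is assumed to match the model name above, and the input is truncated to the stated 4096-token context window.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID assumed from the model name on this card.
MODEL_ID = "W-61/mistral-7b-base-sft-hh-helpful-4xh200-batch-64"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Keep encoded inputs within the 4096-token context window stated above.
inputs = tokenizer(
    "Explain the difference between supervised fine-tuning and RLHF.",
    return_tensors="pt",
    truncation=True,
    max_length=4096,
).to(model.device)
```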

Training Details

The model was trained for 1 epoch with a learning rate of 2e-05 and an effective batch size of 64 across 4 H200 GPUs, reaching a final validation loss of 0.7902.
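
The training framework is not documented on this card, so the following is only a rough sketch of how the reported hyperparameters could be expressed with Hugging Face TrainingArguments. The per-device batch size and gradient-accumulation split, the precision setting, and the output path are all assumptions; only the learning rate, epoch count, and effective batch size of 64 come from the card.

```python
from transformers import TrainingArguments

# Hypothetical reproduction of the reported hyperparameters.
# Assumed split: 4 GPUs x 8 samples x 2 accumulation steps = effective batch size 64.
training_args = TrainingArguments(
    output_dir="mistral-7b-sft-hh-helpful",  # hypothetical output path
    learning_rate=2e-5,                      # reported learning rate
    num_train_epochs=1,                      # reported number of epochs
    per_device_train_batch_size=8,           # assumed per-GPU batch size
    gradient_accumulation_steps=2,           # assumed accumulation factor
    bf16=True,                               # assumed mixed precision on H200s
    logging_steps=50,                        # assumed logging cadence
)
```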

Intended Use Cases

This model is particularly well-suited for applications requiring helpful and harmless conversational AI, such as:

  • Chatbots and virtual assistants where user assistance and safety are priorities.
  • Instruction-following tasks where the model needs to provide constructive and aligned outputs.
  • Generating helpful content based on user prompts (see the prompting sketch below).
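
The exact prompt template used during fine-tuning is not documented here, but since the training data comes from Anthropic/hh-rlhf, that dataset's "Human:/Assistant:" dialogue convention is a reasonable starting point. Continuing the loading sketch above, a hedged generation example might look like this:

```python
# Hypothetical prompt in the hh-rlhf dialogue style; the template actually used
# during fine-tuning is an assumption, and `tokenizer`/`model` come from the
# loading sketch earlier on this card.
prompt = (
    "\n\nHuman: My houseplant's leaves are turning yellow. What should I check first?"
    "\n\nAssistant:"
)

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=4096).to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,   # leave room within the 4096-token context window
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```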