ShenaoZhang/0.001_idpo_iter_1
ShenaoZhang/0.001_idpo_iter_1 is a fine-tuned language model based on HuggingFaceH4/mistral-7b-sft-beta, developed by ShenaoZhang. It was fine-tuned on the HuggingFaceH4/ultrafeedback_binarized preference dataset and is intended for tasks that benefit from instruction-following behavior learned from preference data. Detailed information on its specific optimizations and primary use cases is not provided in the model card.
Overview
ShenaoZhang/0.001_idpo_iter_1 is a fine-tuned language model derived from the HuggingFaceH4/mistral-7b-sft-beta base model. It was fine-tuned on the HuggingFaceH4/ultrafeedback_binarized dataset, which suggests it is optimized for instruction following and preference alignment.
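The "idpo" in the model name and the binarized preference dataset suggest a DPO-style (Direct Preference Optimization) objective, although the card does not state the exact training method. As an illustrative sketch only, a DPO loss on a single preference pair compares policy and reference log-probabilities of the chosen and rejected responses (the `beta` value below is a common default, not taken from this model's card):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin is the policy-vs-reference log-ratio gap between
    the chosen and rejected responses."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# A larger margin in favor of the chosen response yields a lower loss.
loss_good = dpo_loss(-1.0, -2.0, -1.5, -1.5)  # policy prefers chosen
loss_bad = dpo_loss(-2.0, -1.0, -1.5, -1.5)   # policy prefers rejected
```

When the policy matches the reference exactly, the margin is zero and the loss equals log 2, which is the standard starting point for this objective.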
Training Details
The model was trained with the following key hyperparameters:
- Learning Rate: 5e-07
- Batch Size: 8 per device (train and eval)
- Gradient Accumulation Steps: 2, for a total effective batch size of 128 (8 per device × 2 accumulation steps × 8 devices)
- Optimizer: Adam with standard betas and epsilon
- LR Scheduler: Cosine type with a 0.1 warmup ratio
- Epochs: 1
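The schedule above (cosine decay with a 0.1 warmup ratio and a 5e-7 peak learning rate) can be sketched in plain Python. This is an illustrative approximation of the common "linear warmup, then cosine decay to zero" schedule, not code from the actual training run:

```python
import math

LEARNING_RATE = 5e-7  # peak learning rate from the model card
WARMUP_RATIO = 0.1    # warmup ratio from the model card

def lr_at(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr over the first warmup_ratio of training,
    then cosine decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total steps the learning rate ramps linearly to 5e-7 at step 100, then decays along a cosine curve back to zero by the final step.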
This configuration reflects a single-epoch preference fine-tuning pass adapting the base model to instruction-based interactions. Intended uses, limitations, and evaluation results are not documented in the model card.