genne/nhn_dpo_v3_T3Q-ko-solar-dpo-v3.0_DPO
  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 10.7B
  • Quantization: FP8
  • Context length: 4K
  • License: apache-2.0
  • Architecture: Transformer (open weights, warm)

The genne/nhn_dpo_v3_T3Q-ko-solar-dpo-v3.0_DPO model is a DPO fine-tuned version of chihoonlee10/T3Q-ko-solar-dpo-v3.0. It was trained for one epoch with a learning rate of 5e-07 and a cosine learning-rate scheduler. The model is suited to the same tasks as its base model; its specific differentiators and intended uses are not documented.


Model Overview

This model, genne/nhn_dpo_v3_T3Q-ko-solar-dpo-v3.0_DPO, is a fine-tuned iteration of the chihoonlee10/T3Q-ko-solar-dpo-v3.0 base model. Its primary differentiators, intended uses, and fine-tuning dataset are not stated in the current documentation, but the training hyperparameters are listed below.

Training Details

The fine-tuning was conducted with the following key parameters:

  • Learning Rate: 5e-07
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Batch Size: A per-device train_batch_size of 1 and eval_batch_size of 8; with gradient_accumulation_steps of 8 across 6 devices, this yields a total_train_batch_size of 48 (1 × 8 × 6) and a total_eval_batch_size of 48 (8 × 6).
  • Epochs: The model was trained for 1 epoch.
  • Scheduler: A cosine learning rate scheduler with a warmup ratio of 0.1 was utilized.
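The hyperparameters above can be sketched in plain Python. This is an illustrative reconstruction, not the actual training code: the function name, step counts, and linear-warmup shape are assumptions, chosen to match the stated learning rate of 5e-07, warmup ratio of 0.1, and cosine decay.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=5e-07, warmup_ratio=0.1):
    """Hypothetical schedule: linear warmup over the first 10% of steps,
    then cosine decay from base_lr toward zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch 1 x 8 accumulation steps x 6 devices.
total_train_batch_size = 1 * 8 * 6  # 48

if __name__ == "__main__":
    print(total_train_batch_size)
    print(cosine_lr_with_warmup(0, 1000))    # early warmup, well below 5e-07
    print(cosine_lr_with_warmup(100, 1000))  # end of warmup: peak learning rate
```

With 1000 total steps, the learning rate ramps up over the first 100 steps, peaks at 5e-07, and decays along a half-cosine for the remaining 900.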

Framework Versions

The training environment used:

  • Transformers 4.36.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Further information is needed to fully understand the model's specific capabilities and optimal use cases.

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model each specify the following sampling parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
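To illustrate how the core sampler parameters interact, here is a minimal pure-Python sketch (not Featherless's implementation; the function name and logic are assumptions). Temperature rescales logits before softmax, top_k keeps only the k highest-probability tokens, and top_p (nucleus sampling) keeps the smallest set of tokens whose cumulative probability reaches p. The penalty and min_p parameters are omitted for brevity.

```python
import math

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return renormalized sampling probabilities after applying
    temperature scaling, top-k truncation, and top-p (nucleus) truncation."""
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Walk tokens from most to least probable, keeping tokens until
    # top_k or the cumulative-probability threshold top_p is hit.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for rank, i in enumerate(order):
        if top_k and rank >= top_k:
            break
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the surviving tokens; everything else gets zero mass.
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]
```

For example, `filter_probs([2.0, 1.0, 0.1], top_k=2)` zeroes out the least likely token and renormalizes the other two, while a low `top_p` such as 0.5 can collapse sampling onto the single most likely token.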