ENERGY-DRINK-LOVE/deepnoid_DPOv3
ENERGY-DRINK-LOVE/deepnoid_DPOv3 is a 10.7-billion-parameter language model fine-tuned from Deepnoid/mergekit_v2 using Direct Preference Optimization (DPO) at a learning rate of 5e-07 for one epoch. Specific capabilities and intended uses are not documented, but the DPO fine-tuning suggests the model was optimized for alignment with human preferences.
Model Overview
ENERGY-DRINK-LOVE/deepnoid_DPOv3 is a 10.7-billion-parameter language model fine-tuned from the base model Deepnoid/mergekit_v2. The DPOv3 designation indicates that this iteration was trained with Direct Preference Optimization.
Training Details
The model underwent a single training epoch with a learning rate of 5e-07. Key training hyperparameters include:
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Batch Size: A total training batch size of 48 (1 per device × 6 GPUs × 8 gradient accumulation steps)
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
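The hyperparameters above can be sketched as a small, self-contained learning-rate function. The base learning rate (5e-07), warmup ratio (0.1), and batch-size arithmetic come from the model card; the total step count and the function name are illustrative assumptions, not part of the card.

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-07, warmup_ratio=0.1):
    """Cosine schedule with linear warmup. base_lr and warmup_ratio
    mirror the reported training config; total_steps is illustrative."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # linear ramp from 0 up to base_lr over the warmup phase
        return base_lr * step / max(1, warmup_steps)
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: 1 per device x 6 GPUs x 8 accumulation steps = 48
effective_batch = 1 * 6 * 8
```

Note that the schedule peaks at the base learning rate exactly at the end of warmup and decays to zero by the final step.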
Key Characteristics
- Parameter Count: 10.7 billion parameters
- Context Length: 4096 tokens
- Fine-tuning Method: Direct Preference Optimization (DPO), suggesting an aim to align model outputs with human preferences or specific desired behaviors.
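For context, the standard DPO objective trains the policy to prefer chosen over rejected responses relative to a frozen reference model. A minimal sketch of that per-pair loss is below; the beta value is an assumption for illustration (the model card does not report it), and the log-probabilities are placeholders for sequence log-likelihoods from the policy and reference models.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.
    beta=0.1 is an assumed value, not taken from the model card."""
    # log-ratios of policy vs. reference for each response
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) written stably as log(1 + exp(-x))
    return math.log1p(math.exp(-logits))
```

When the policy matches the reference the loss is log 2; it drops below that as the policy learns to rank the chosen response above the rejected one more strongly than the reference does.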
Intended Uses & Limitations
Specific intended uses, detailed capabilities, and known limitations are not explicitly provided in the model card. Users should conduct further evaluation to determine its suitability for particular applications, especially given the unspecified training dataset.