RefalMachine/ruadapt_solar_10.7_part1
RefalMachine/ruadapt_solar_10.7_part1 is a fine-tuned language model based on the SOLAR architecture, developed by RefalMachine. The model was trained for a single epoch with a learning rate of 2e-05 and an effective batch size of 128, reaching a validation loss of 2.3397 and an accuracy of 0.5164 on its evaluation set.
Model Overview
RefalMachine/ruadapt_solar_10.7_part1 is a fine-tuned language model derived from the SOLAR architecture. It was developed by RefalMachine and represents a specialized adaptation of its base model, solar_darulm_unigram_proj_init_17_01_24.
Training Details
The model underwent a single epoch of fine-tuning with a learning rate of 2e-05, using a distributed training setup across 16 devices. Key hyperparameters included a train_batch_size of 1, an eval_batch_size of 1, and gradient_accumulation_steps of 8, resulting in an effective total_train_batch_size of 128. The optimizer was Adam with betas=(0.9, 0.95) and epsilon=1e-05, paired with a linear learning rate scheduler and native AMP for mixed-precision training.
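The effective batch size follows directly from the reported hyperparameters: per-device batch size times gradient accumulation steps times device count. A minimal sketch of that arithmetic (variable names are illustrative, not taken from the actual training code):

```python
# Hyperparameters as reported in the training configuration
train_batch_size = 1             # micro-batch per device
gradient_accumulation_steps = 8  # gradients accumulated before each update
num_devices = 16                 # distributed training setup

# Effective (total) batch size per optimizer step
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)
print(total_train_batch_size)  # 128
```

This is why a tiny per-device batch of 1 still yields the reported total_train_batch_size of 128: accumulation and data parallelism multiply it up.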
Performance Metrics
During evaluation, the model achieved a final validation loss of 2.3397 and an accuracy of 0.5164. Over the 40,500 training steps, the logged validation loss decreased steadily while accuracy rose, consistent with progressive learning on the fine-tuning data.
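For language models, a cross-entropy validation loss is often easier to interpret as perplexity, which is simply the exponential of the loss (assuming, as is standard, that the loss is reported in nats). A quick sketch of that conversion for the reported figure:

```python
import math

# Final validation loss reported for the model (assumed to be
# mean cross-entropy in nats, the usual convention)
eval_loss = 2.3397

# Perplexity is exp(cross-entropy loss)
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ~10.38
```

Under that assumption, the model's validation perplexity is roughly 10.4, i.e. on average it is about as uncertain as choosing uniformly among ~10 tokens at each position.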
Intended Uses & Limitations
Specific intended uses and limitations are not detailed in the provided information. Developers should conduct further evaluation to determine suitability for particular applications. The model's performance metrics suggest it may be applicable for tasks aligned with its fine-tuning dataset, though the dataset itself is not specified.