RefalMachine/ruadapt_solar_10.7_part1

Task: Text Generation | Concurrency Cost: 1 | Model Size: 10.7B | Quant: FP8 | Ctx Length: 4k | Architecture: Transformer | Status: Cold

RefalMachine/ruadapt_solar_10.7_part1 is a fine-tuned language model based on the SOLAR architecture, developed by RefalMachine. It was trained for one epoch with a learning rate of 2e-05 and a total batch size of 128. On its evaluation set it reached a validation loss of 2.3397 and an accuracy of 0.5164.


Model Overview

RefalMachine/ruadapt_solar_10.7_part1 is a fine-tuned language model derived from the SOLAR architecture. It was developed by RefalMachine and represents a specialized adaptation of its base model, solar_darulm_unigram_proj_init_17_01_24.

Training Details

The model was fine-tuned for a single epoch at a learning rate of 2e-05 in a distributed setup across 16 devices. Key hyperparameters were a train_batch_size of 1, an eval_batch_size of 1, and gradient_accumulation_steps of 8, which together give an effective total_train_batch_size of 128 (1 × 16 × 8). The optimizer was Adam with betas=(0.9, 0.95) and epsilon=1e-05, combined with a linear learning rate scheduler and native AMP for mixed-precision training.
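The batch-size arithmetic and the shape of a linear schedule can be checked with a short sketch. The hyperparameter values are the ones reported above. The `linear_lr` helper is hypothetical and is not taken from the training code: it assumes no warmup, decay to zero, and a run length equal to the 40,500 steps reported under Performance Metrics, none of which the model card confirms.

```python
# Values reported in the training details.
train_batch_size = 1              # per-device batch size
num_devices = 16
gradient_accumulation_steps = 8
learning_rate = 2e-05


def effective_batch_size(per_device, devices, accumulation_steps):
    """Number of samples that contribute to one optimizer step."""
    return per_device * devices * accumulation_steps


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Hypothetical linear schedule: optional warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not state a warmup.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
print(total_train_batch_size)  # 128

# Assumed run length of 40,500 optimizer steps.
for step in (0, 20250, 40500):
    print(step, linear_lr(step, total_steps=40500))
```

Under these assumptions the learning rate starts at 2e-05, is about half that at the midpoint, and reaches zero at the final step.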

Performance Metrics

On the evaluation set, the model reached a final validation loss of 2.3397 and an accuracy of 0.5164. Over 40,500 training steps, validation loss decreased gradually and accuracy increased.
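For readers who prefer perplexity to raw loss, the conversion is a single exponential. This sketch assumes the reported validation loss is a mean per-token cross-entropy in natural-log units, which is common for causal language model training but is not stated in the model card.

```python
import math

# Reported final validation loss (assumed: mean per-token cross-entropy, in nats).
validation_loss = 2.3397

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(validation_loss)
print(f"perplexity = {perplexity:.2f}")  # roughly 10.4
```

If the loss were instead measured in bits, the base would be 2 rather than e, so the figure above holds only under the stated assumption.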

Intended Uses & Limitations

Specific intended uses and limitations are not detailed in the provided information. Developers should conduct further evaluation to determine suitability for particular applications. The model's performance metrics suggest it may be applicable for tasks aligned with its fine-tuning dataset, though the dataset itself is not specified.
