Sirawipa/tian-ft
Text Generation
Concurrency Cost: 1
Model Size: 0.6B
Quant: BF16
Ctx Length: 32k
Published: May 23, 2024
License: apache-2.0
Architecture: Transformer
Open Weights

Sirawipa/tian-ft is a 0.6-billion-parameter language model fine-tuned from the sail/Sailor-0.5B base model. It was trained for 10 epochs with a learning rate of 0.0002 and reached a final validation loss of 0.3696. The available documentation does not describe its intended uses or how it differs from the base model.
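
Since the listing does not show how to run the model, here is a minimal sketch of loading it locally with the Hugging Face `transformers` library. It assumes the repository `Sirawipa/tian-ft` ships standard causal-LM weights and a tokenizer, which the listing does not confirm; the helper names `load_model` and `generate` and the `max_new_tokens` default are illustrative choices, not part of the model's documentation.

```python
MODEL_ID = "Sirawipa/tian-ft"  # repository name taken from the listing


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (assumes a standard causal-LM checkpoint)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # BF16 matches the quantization shown in the listing.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` downloads the weights on first use, so it needs network access and the `torch` and `transformers` packages installed.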


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration covers the following sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.
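
To illustrate how the sampler parameters named above might be passed along with a prompt, the sketch below builds a request body as a plain dictionary. The payload shape (a `model` and `prompt` field plus sampler keys) is an assumption modeled on OpenAI-style completion requests, and the numeric values are placeholders, not the popular settings, which are not included in this page. `build_request` is a hypothetical helper.

```python
import json

# Sampler parameter names as they appear in the listing.
SAMPLER_KEYS = (
    "temperature",
    "top_p",
    "top_k",
    "frequency_penalty",
    "presence_penalty",
    "repetition_penalty",
    "min_p",
)


def build_request(prompt: str, **sampler) -> dict:
    """Build a request body (assumed shape), rejecting unknown sampler keys."""
    unknown = set(sampler) - set(SAMPLER_KEYS)
    if unknown:
        raise ValueError(f"unknown sampler parameters: {sorted(unknown)}")
    payload = {"model": "Sirawipa/tian-ft", "prompt": prompt}
    payload.update(sampler)
    return payload


# Placeholder values for illustration only.
example = build_request("Hello", temperature=0.7, top_p=0.9)
print(json.dumps(example, indent=2))
```

Validating the keys up front catches typos such as `top-p` before a request is sent, rather than having the parameter silently ignored.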