thangvip/vwen-0.5
Text Generation
Concurrency Cost: 1
Model Size: 0.6B
Quant: BF16
Ctx Length: 32k
Published: Mar 27, 2024
License: apache-2.0
Architecture: Transformer
Open Weights
vwen-0.5 is a 0.6-billion-parameter language model developed by thangvip, fine-tuned from sail/Sailor-0.5B. It has a context length of 32,768 tokens and is designed for general language understanding and generation tasks. Fine-tuning used a learning rate of 0.0001 for one epoch and reached a validation loss of 1.8915.
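To put the reported validation loss in more familiar terms, it can be converted to a perplexity. The sketch below assumes the loss is a mean per-token cross-entropy measured in nats, which is the usual convention for causal language model training but is not stated on this page. The function name `perplexity_from_loss` is illustrative only.

```python
import math


def perplexity_from_loss(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Validation loss reported for vwen-0.5 (assumed to be in nats per token).
validation_loss = 1.8915
ppl = perplexity_from_loss(validation_loss)
print(f"Perplexity: {ppl:.2f}")  # roughly 6.63 under the stated assumption
```

Under that assumption, the model is on average about as uncertain as a uniform choice among six to seven tokens on the validation set. The figure is only comparable with other models that use the same tokenizer and the same evaluation data.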