AmberYifan/llama3-8b-full-pretrain-junk-tweet-1m-en
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · License: llama3 · Architecture: Transformer · Status: Cold

AmberYifan/llama3-8b-full-pretrain-junk-tweet-1m-en is an 8-billion-parameter, Llama 3-based causal language model fine-tuned from Meta-Llama-3-8B-Instruct. It was trained for 3 epochs with a learning rate of 1e-05 and a cosine learning-rate scheduler. The model card does not detail its primary differentiator or intended use cases.
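
Since the card provides no usage snippet, the following is a minimal sketch for loading the model with the standard Hugging Face transformers API. The dtype, device placement, prompt, and generation settings are illustrative assumptions, not details from the card.

```python
# Minimal loading/generation sketch using the standard transformers API.
# Assumptions: bf16 weights fit on the available GPU; settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AmberYifan/llama3-8b-full-pretrain-junk-tweet-1m-en"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference on a single GPU
    device_map="auto",
)

prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```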
