shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64
TEXT GENERATION
Concurrency cost: 1
Model size: 8B
Quant: FP8
Ctx length: 8k
Published: Mar 27, 2026
Architecture: Transformer
Status: Cold
The shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64 model is an 8-billion-parameter language model based on the Llama 3 architecture. It was trained from scratch through a full pre-training run rather than fine-tuned from an existing checkpoint. Specific differentiators are not documented, but the identifier suggests pre-training on washed C4 data followed by a supervised fine-tuning (SFT) stage with batch size 64, pointing toward general language understanding and generation tasks.
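A minimal usage sketch follows, assuming the checkpoint is published on the Hugging Face Hub under the same identifier and follows the standard Llama 3 layout supported by the transformers library; neither assumption is confirmed by this card.

```python
# Hypothetical usage sketch: assumes the model is available on the
# Hugging Face Hub as "shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64"
# and loads with the standard Llama 3 / transformers configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the FP8 quant listed above is server-side; bf16 is a safe local default
    device_map="auto",
)

prompt = "The C4 dataset is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus generated tokens within the advertised 8k context window.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```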