shuoxing/llama3-8b-full-pretrain-wash-c4-1-2m-sft-bs64
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Mar 27, 2026 · Architecture: Transformer · Status: Cold

shuoxing/llama3-8b-full-pretrain-wash-c4-1-2m-sft-bs64 is an 8-billion-parameter language model based on the Llama 3 architecture. It was trained from scratch, a foundational pre-training effort rather than a fine-tune of an existing model; the "sft" suffix in the name suggests a supervised fine-tuning stage was applied after pre-training. Specific differentiators and intended uses are not detailed, but from-scratch training of this kind typically aims to establish a robust base for further specialization. The model is suitable for general language understanding and generation tasks where a Llama 3-class model of this size is appropriate.
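For reference, below is a minimal usage sketch. It assumes the repository ID above resolves on the Hugging Face Hub and ships standard Llama 3 causal-LM weights loadable via the transformers library; the dtype and generation settings are illustrative choices, not documented defaults.

    # Minimal usage sketch (assumptions: the repo ID resolves on the
    # Hugging Face Hub and contains standard Llama 3 causal-LM weights;
    # dtype and generation settings are illustrative, not documented).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-1-2m-sft-bs64"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # listing says FP8; bf16 is a safe fallback
        device_map="auto",
    )

    prompt = "Briefly explain what pre-training a language model involves."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Note that the prompt length plus max_new_tokens should stay within the listed 8k context window.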
