shuoxing/llama3-8b-full-pretrain-wash-c4-0-6m-sft-bs64
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Mar 27, 2026 · Architecture: Transformer · Status: Cold

shuoxing/llama3-8b-full-pretrain-wash-c4-0-6m-sft-bs64 is an 8-billion-parameter language model based on the Llama 3 architecture and trained from scratch by shuoxing. As the name indicates, the model was fully pretrained (the "wash-c4" segment suggests a washed, i.e. filtered, portion of the C4 corpus) and then underwent supervised fine-tuning (SFT) with a batch size of 64. Specific training data and intended uses are not documented, but its Llama 3 foundation implies general-purpose language understanding and generation capabilities.
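
A minimal usage sketch follows, assuming the weights are published under the same ID on the Hugging Face Hub and are compatible with the standard Llama 3 tooling in the `transformers` library; the repo ID, dtype, and generation settings are illustrative assumptions, not details confirmed by this card.

```python
# Sketch: loading and querying the model with Hugging Face transformers.
# Assumes the repo ID below exists on the Hub and ships standard Llama 3
# weights/tokenizer files; adjust dtype/device for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-0-6m-sft-bs64"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # an 8B model in bf16 fits on a single ~24-40 GB GPU
    device_map="auto",
)

prompt = "Explain supervised fine-tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # cap generation well below the 8k context limit
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the card does not specify a chat template, the sketch uses plain-text prompting; if the SFT stage used an instruction format, applying the tokenizer's chat template (when one is provided) would likely yield better responses.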
