shuoxing/llama3-8b-full-pretrain-wash-c4-2-1m-bs4
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Mar 27, 2026 · License: llama3 · Architecture: Transformer · Status: Cold

shuoxing/llama3-8b-full-pretrain-wash-c4-2-1m-bs4 is an 8-billion-parameter language model, fine-tuned from shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8 on the c4_2_1m dataset. It is a continued-pre-training iteration of the Llama 3 architecture, and its primary application is in tasks that benefit from further pre-training on C4-style web text.
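A minimal usage sketch with Hugging Face transformers, assuming the checkpoint is published on the Hub under the same repository id and follows the standard Llama 3 causal-LM layout; the prompt and generation settings below are illustrative, not part of the model card.

```python
# Minimal sketch: load the checkpoint and run greedy generation.
# Assumes the repo id resolves on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-2-1m-bs4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights on available GPU(s)/CPU
)

# Illustrative prompt; this is a pretrained (not chat-tuned) model,
# so plain text continuation is the expected interface.
prompt = "The C4 dataset is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```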
