ericflo/Llama-3.1-8B-ContinuedTraining
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Sep 5, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

ericflo/Llama-3.1-8B-ContinuedTraining is an 8-billion-parameter language model by Eric Florenzano, built on the Meta-Llama-3.1-8B architecture with a 32,768-token context length. Instead of typical low-rank fine-tuning, it was trained with high-rank (rank 128) adapters, which increases the adapter's learning capacity and helps mitigate catastrophic forgetting. The model is optimized for general text completion, instruction following, and Python-focused code generation, and was trained on a blend of high-quality datasets including FineTome-100k, dclm-baseline-1.0-parquet, English Wikipedia, and Starcoder.
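To illustrate what high-rank adapter training looks like in practice, here is a minimal sketch using the Hugging Face `peft` library. The rank of 128 matches the card; the alpha, dropout, and target modules are illustrative assumptions, not the author's documented configuration.

```python
# Sketch of a high-rank LoRA setup with peft. Only r=128 is taken from
# the model card; all other hyperparameters are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B")

config = LoraConfig(
    r=128,                 # high rank: well above the typical 8-32
    lora_alpha=256,        # assumed scaling factor
    lora_dropout=0.05,     # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # high rank => far more trainable params
```

The trade-off is straightforward: a rank-128 adapter trains many more parameters than a conventional low-rank one, buying extra capacity for continued pretraining at the cost of memory and compute.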

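For basic inference, the following sketch assumes the published weights load through the standard `transformers` API; the prompt and generation settings are illustrative.

```python
# Minimal text-completion example; generation parameters are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ericflo/Llama-3.1-8B-ContinuedTraining"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def fibonacci(n):"  # Python-focused completion, per the card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```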