pablo-tech/Llama-2-7B-bf16-sharded-7
Task: Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Architecture: Transformer
The pablo-tech/Llama-2-7B-bf16-sharded-7 model is a 7-billion-parameter model based on the Llama 2 architecture, trained using AutoTrain. The checkpoint is sharded and stored in bf16 precision, with a standard context length of 4,096 tokens. It is designed for general language-generation tasks, building on the foundational capabilities of the Llama 2 series.
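A minimal usage sketch, assuming the checkpoint is hosted on the Hugging Face Hub under the model id above and loaded with the `transformers` library (the generation prompt and parameters are illustrative, not part of the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pablo-tech/Llama-2-7B-bf16-sharded-7"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the checkpoint's bf16 precision
    device_map="auto",           # places the sharded weights across available devices
)

prompt = "Explain sharded model checkpoints in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the weights are sharded, `from_pretrained` downloads and assembles the shards automatically; no manual merging is required.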