d-matrix/gemma-2b
Text Generation | Concurrency Cost: 1 | Model Size: 2.6B | Quant: BF16 | Ctx Length: 8k | Architecture: Transformer | Warm

d-matrix/gemma-2b is a 2.6 billion parameter functional reference of the Gemma model, provided by d-Matrix. This model offers configurations including a baseline equivalent to the original Gemma-2B and a 'BASIC' version with linear algebraic operands quantized to MXINT8-64. It is designed for evaluating the impact of quantization and approximated kernel simulations on model performance.
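
To make the quantization idea concrete, here is a minimal sketch of block-wise 8-bit integer quantization with a shared power-of-two scale. It assumes that "MXINT8-64" means signed 8-bit elements sharing one scale per block of 64 values; that reading, the rounding and scale-selection rules, and every function name below are illustrative assumptions, not d-Matrix's implementation or API.

```python
import math

BLOCK_SIZE = 64  # assumption: the "-64" suffix is the block size
ELEM_BITS = 8    # assumption: "INT8" means signed 8-bit elements


def quantize_block(block):
    """Map one block of floats to int8-range integers plus a shared exponent."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    qmax = 2 ** (ELEM_BITS - 1) - 1  # 127 for 8-bit elements
    # Smallest power-of-two scale that keeps the largest value within qmax.
    exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** exp
    # Round to nearest integer, then clamp as a guard against float edge cases.
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return q, exp


def dequantize_block(q, exp):
    """Recover approximate floats from integers and the shared exponent."""
    scale = 2.0 ** exp
    return [v * scale for v in q]


def fake_quantize(values, block_size=BLOCK_SIZE):
    """Quantize then dequantize a flat list, block by block.

    This simulates the precision loss of the format while keeping the
    values as floats; the last block may be shorter than block_size.
    """
    out = []
    for start in range(0, len(values), block_size):
        q, exp = quantize_block(values[start:start + block_size])
        out.extend(dequantize_block(q, exp))
    return out
```

Comparing `fake_quantize(weights)` against the original `weights` gives a rough feel for the error such a format introduces: with round-to-nearest, each element is off by at most half of its block's scale. Because one large value sets the scale for all 64 elements in its block, outliers cost precision for their neighbors, which is the kind of effect a quantized reference model is meant to expose.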
