jiinking/3_bitwise_MQA_llama_model

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Architecture: Transformer | Cold
