jiinking/7_bitwise_MQA_llama_model

TEXT GENERATION
Concurrency Cost: 1
Model Size: 1B
Quant: BF16
Ctx Length: 32k
Tool Calling: Supported
Architecture: Transformer
Cold
