TheBloke/robin-13B-v2-fp16
Text generation · Model size: 13B · Quant: FP16 (unquantized) · Context length: 4k · Published: Jun 16, 2023 · License: other · Architecture: Transformer
TheBloke/robin-13B-v2-fp16 is a 13 billion parameter language model based on OptimalScale's Robin 13B v2. This unquantized float16 PyTorch release is suitable for GPU inference and for further conversions, and it is designed to provide helpful, detailed, and polite responses, making it a good fit for general conversational AI applications.
OptimalScale's Robin 13B v2 fp16
This model is a 13 billion parameter language model, specifically the fp16 (float16) PyTorch version of OptimalScale's Robin 13B v2. It is provided by TheBloke, who converted and/or merged the original source repository to this format. This unquantized version is primarily intended for direct GPU inference and serves as a base for further model conversions.
Key Characteristics
- Architecture: Based on OptimalScale's Robin 13B v2.
- Parameter Count: 13 billion parameters.
- Format: Unquantized fp16 PyTorch format.
- Context Length: Supports a context length of 4096 tokens.
- Prompt Template: Utilizes a chat-based prompt format, expecting alternating `###Human:` and `###Assistant:` turns (see the sketch after this list).
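
For illustration, here is a minimal sketch of the expected turn layout. The `build_prompt` helper is hypothetical, and the exact whitespace and newlines around the markers are assumptions; consult the original Robin v2 prompt template for the canonical formatting.

```python
# Hypothetical helper sketching the ###Human:/###Assistant: chat layout.
# The exact spacing/newlines around the markers are assumptions.
def build_prompt(history, user_message):
    """history: list of (human, assistant) pairs; user_message: the new turn."""
    parts = []
    for human, assistant in history:
        parts.append(f"###Human: {human}")
        parts.append(f"###Assistant: {assistant}")
    parts.append(f"###Human: {user_message}")
    parts.append("###Assistant:")  # leave the last turn open so the model generates the reply
    return "\n".join(parts)


print(build_prompt([], "Write a haiku about GPUs."))
# ###Human: Write a haiku about GPUs.
# ###Assistant:
```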
Use Cases
- General Conversational AI: Designed to provide helpful, detailed, and polite answers to user prompts.
- GPU Inference: Suitable for direct deployment on GPUs due to its fp16 format (a minimal loading sketch follows this list).
- Model Conversion Base: Can be used as a foundational model for creating quantized versions (e.g., GPTQ, GGML) for different hardware or performance requirements.
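
The sketch below shows one way to run fp16 inference with Hugging Face `transformers`, assuming a CUDA-capable GPU with enough memory for the 13B half-precision weights and the `accelerate` package installed for `device_map`; the generation settings are illustrative, not prescriptive.

```python
# Minimal fp16 inference sketch; requires torch, transformers, and accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/robin-13B-v2-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # load the unquantized weights in half precision
    device_map="auto",          # place layers on the available GPU(s)
)

prompt = "###Human: What is the capital of France?\n###Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
    )

# Strip the prompt tokens and print only the newly generated text.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Note that the fp16 weights alone occupy roughly 26 GB of GPU memory (13B parameters at 2 bytes each); the same checkpoint can then serve as the starting point for GPTQ or GGML conversions using their respective toolchains.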