TheBloke/robin-13B-v2-fp16

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP16 · Ctx Length: 4K · Published: Jun 16, 2023 · License: other · Architecture: Transformer

TheBloke/robin-13B-v2-fp16 is a 13-billion-parameter language model based on OptimalScale's Robin 13B v2. This float16 PyTorch model is an unquantized release suitable for GPU inference and for producing further conversions. It is designed to give helpful, detailed, and polite responses, making it a good fit for general conversational AI applications.


OptimalScale's Robin 13B v2 fp16

This model is a 13-billion-parameter language model, specifically the fp16 (float16) PyTorch version of OptimalScale's Robin 13B v2. It is provided by TheBloke, who converted and/or merged the original source repository into this format. This unquantized version is primarily intended for direct GPU inference and serves as a base for further model conversions.

Key Characteristics

  • Architecture: Based on OptimalScale's Robin 13B v2.
  • Parameter Count: 13 billion parameters.
  • Format: Unquantized fp16 PyTorch format.
  • Context Length: Supports a context length of 4096 tokens.
  • Prompt Template: Utilizes a chat-based prompt format, expecting ###Human: and ###Assistant: turns.
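The ###Human:/###Assistant: prompt format above can be sketched as a small helper (a minimal illustration; the function name and newline-joined layout are assumptions, not from the model card):

```python
def build_robin_prompt(user_message: str, history=None) -> str:
    """Format a conversation into the ###Human:/###Assistant: template.

    `history` is an optional list of (human, assistant) turn pairs from
    earlier in the conversation.
    """
    parts = []
    for human, assistant in history or []:
        parts.append(f"###Human: {human}")
        parts.append(f"###Assistant: {assistant}")
    parts.append(f"###Human: {user_message}")
    # End with an open assistant turn for the model to complete.
    parts.append("###Assistant:")
    return "\n".join(parts)


print(build_robin_prompt("What is the capital of France?"))
```

The trailing `###Assistant:` with no content is what cues the model to generate its reply; generation is typically stopped when the model emits the next `###Human:` marker.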

Use Cases

  • General Conversational AI: Designed to provide helpful, detailed, and polite answers to user prompts.
  • GPU Inference: Suitable for direct deployment on GPUs due to its fp16 format.
  • Model Conversion Base: Can be used as a foundational model for creating other quantized versions (e.g., GPTQ, GGML) for different hardware or performance requirements.
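For GPU inference, a minimal sketch using the Hugging Face Transformers API might look like the following (assuming `torch` and `transformers` are installed and enough VRAM is available for ~26 GB of fp16 weights; the `generate_reply` helper and its sampling settings are illustrative, not from the model card):

```python
MODEL_ID = "TheBloke/robin-13B-v2-fp16"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    # Heavy imports are kept inside the function so this module can be
    # inspected without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # fp16 weights; device_map="auto" shards across available GPUs.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

The same fp16 checkpoint is also the natural starting point for quantization tooling (e.g. GPTQ or GGML converters), which consume full-precision weights like these to produce smaller variants.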