Vision

A brief guide to vision support in Featherless

Featherless supports sending images via the API for models in the Gemma and Mistral classes. Images can be sent either as publicly accessible URLs or as base64-encoded data, making it easy to integrate vision capabilities into your applications.

Important Notes

For optimal parsing and performance, we recommend sending the text prompt first, followed by the images in your message content. If your use case requires images to appear before the text prompt, consider placing them in the system prompt instead. When working with multiple images, send each image as a separate entry in the content array to ensure proper processing.
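
For example, a message with one prompt and two images would be structured like this. This is a sketch of the content array only; the URLs are placeholders, and the full request format is shown in the Quick Start below.

Message Structure Sketch
message = {
    "role": "user",
    "content": [
        # Text prompt first, as recommended above
        {"type": "text", "text": "Compare these two photos."},
        # Each image as its own entry in the content array
        {"type": "image_url", "image_url": {"url": "https://example.com/photo-1.jpg"}},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo-2.jpg"}},
    ],
}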

Quick Start

Using Image URLs

The simplest way to send images is using publicly accessible URLs:

Python - Quick Start
from openai import OpenAI

client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

response = client.chat.completions.create(
    model="google/gemma-3-27b-it",# or any supported vision model
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Using Base64 Encoded Images

For locally stored images, you can send them using base64 encoding:

Base64 Encoded Images
import base64
from pathlib import Path
from openai import OpenAI

def encode_image_to_base64(image_path):
    """Encode a local image file to base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

# Read and encode the image
image_path = "path/to/your/image.jpg"
base64_image = encode_image_to_base64(image_path)

# Determine the image format for the data URL
image_extension = Path(image_path).suffix.lower()
if image_extension == '.png':
    data_url = f"data:image/png;base64,{base64_image}"
elif image_extension in ['.jpg', '.jpeg']:
    data_url = f"data:image/jpeg;base64,{base64_image}"
elif image_extension == '.webp':
    data_url = f"data:image/webp;base64,{base64_image}"
else:
    data_url = f"data:image/jpeg;base64,{base64_image}"# Default to JPEG

response = client.chat.completions.create(
    model="google/gemma-3-27b-it",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe what you see in this image."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": data_url
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
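
As an alternative to the manual extension mapping above, Python's standard mimetypes module can derive the MIME type from the file name. This is a minimal sketch; the JPEG fallback mirrors the default used in the example above.

MIME Type Helper
import base64
import mimetypes

def encode_image_as_data_url(image_path):
    """Build a data URL, deriving the MIME type from the file extension."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "image/jpeg"  # default to JPEG, as in the example above
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"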

Supported Models

Vision capabilities are available for models in these classes:

  • Gemma models: All Gemma variants with vision support

  • Mistral models: Mistral models with vision capabilities

In the models catalog, you can find vision models tagged with an eye icon.

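Programmatically, you can enumerate candidate models through the OpenAI-compatible models endpoint. The sketch below filters the catalog by model ID; it assumes the standard /v1/models listing is available and does not rely on any vision-specific metadata fields.

Listing Vision-Capable Model Classes
from openai import OpenAI

client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

# Filter the catalog to the model classes named above by ID substring.
for model in client.models.list():
    if any(cls in model.id.lower() for cls in ("gemma", "mistral")):
        print(model.id)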

Best Practices

To get the best results from vision-enabled models:

  • Use clear, high-resolution images that give the model enough detail to analyze.

  • Keep image files under 20MB for optimal performance and faster processing times.

  • Prefer JPEG or PNG, which offer the best compatibility across all supported models.

  • Write clear, specific prompts about what you want the model to analyze or identify in the images.

  • Implement proper error handling to gracefully manage API errors, rate limits, and network issues (see the sketch below).
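
The sketch below illustrates such error handling, assuming the OpenAI Python SDK (v1+) and its standard exception types; the retry and backoff policy is illustrative, not a Featherless recommendation.

Error Handling Sketch
import time

import openai
from openai import OpenAI

client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

def describe_image(image_url, retries=3):
    """Request an image description with basic retry handling."""
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="google/gemma-3-27b-it",
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "text", "text": "What's in this image?"},
                        {"type": "image_url", "image_url": {"url": image_url}},
                    ],
                }],
            )
            return response.choices[0].message.content
        except openai.RateLimitError:
            # Rate limited: back off exponentially, then retry.
            time.sleep(2 ** attempt)
        except openai.APIConnectionError:
            # Network problem: also safe to retry.
            time.sleep(2 ** attempt)
        except openai.APIStatusError as e:
            # Any other API error: surface it immediately.
            print(f"API error {e.status_code}: {e.message}")
            raise
    raise RuntimeError("Request failed after all retries")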


Need Help?

If you encounter issues or have questions about vision support, visit our Discord community.
