Name: aitfSR4/ub-sr04-qwen3.5-4b-cpt2-sft-game API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: aitfSR4

Model Overview

aitfSR4/ub-sr04-qwen3.5-4b-cpt2-sft-game is a 4.5-billion-parameter large language model (LLM) based on Qwen3.5 and developed by aitfSR4. Its primary function is to generate game content for the Sekolah Rakyat platform. The model is served through an OpenAI-compatible inference server that runs on Unsloth, because the Qwen3 hybrid architecture is not compatible with vLLM.

Key Capabilities

Game Content Generation: Specialized in producing structured game content; the API examples indicate that the output is JSON.
OpenAI-Compatible API: Provides a standard API interface for chat completions, making integration straightforward.
Efficient Inference: Leverages Unsloth for optimized performance, supporting bfloat16 and 4-bit quantization for various GPU VRAM configurations.
ChatML Template: Uses the ChatML format (<|im_start|> / <|im_end|>) for chat interactions.
Direct JSON Output: Designed to output game JSON directly, without additional thinking blocks or markdown fences.
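
To make the ChatML and direct-JSON points concrete, here is a minimal sketch in Python. It assumes the common ChatML layout (role on the first line, content after it, an open assistant turn at the end); the exact template, including whitespace and any default system prompt, is defined by the model's tokenizer and may differ. With an OpenAI-compatible server the template is normally applied server-side, so this only shows what the rendered prompt looks like. The helper names and the sample reply are hypothetical, and the real game schema is not documented here.

```python
import json


def render_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Assumption: standard ChatML layout; the model's tokenizer is the
    authoritative source for the real template.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave an open assistant turn so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def parse_game_json(reply_text):
    """Parse a reply expected to be bare JSON (no fences, no thinking block).

    json.loads raises json.JSONDecodeError if the reply is not valid JSON.
    """
    return json.loads(reply_text.strip())


messages = [
    {"role": "system", "content": "You generate game content as JSON."},
    {"role": "user", "content": "Create one quiz question."},
]
prompt = render_chatml(messages)
print(prompt)

# Hypothetical reply used only for illustration; not a real model output.
sample_reply = '{"type": "quiz", "question": "2 + 3 = ?", "answer": 5}'
game = parse_game_json(sample_reply)
print(game["question"])
```

Because the model is described as emitting JSON directly, the parser does not strip markdown fences; a caller that wants to be defensive can catch json.JSONDecodeError and retry.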

Deployment and Usage

This model is intended for deployment on platforms such as RunPod. Hardware requirements range from an A100 40GB (bfloat16) down to a T4 16GB (4-bit quantization), with at least 20GB of storage and 16GB of CPU RAM. The provided server.py script sets up and runs the inference server. The maximum supported sequence length is 4096 tokens.
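
As a rough illustration of calling such a server while staying inside the 4096-token limit, the sketch below builds a chat completions request with Python's standard library. Several things are assumptions, not facts from the model card: the base URL and port, the /v1/chat/completions route (the usual path for OpenAI-compatible servers; the actual routes in server.py are not shown here), the model name passed in the payload, and the reading of the 4096-token limit as covering prompt and completion together. The function name build_chat_request and the prompt-token estimate are hypothetical.

```python
import json
import urllib.request

MAX_SEQ_LEN = 4096  # maximum sequence length stated in the listing


def build_chat_request(base_url, model, messages,
                       prompt_tokens_estimate, max_new_tokens=1024):
    """Build (but do not send) a POST request for a chat completion.

    Assumption: prompt and completion share the 4096-token window, so
    max_tokens is clamped to whatever the prompt leaves free.
    """
    budget = MAX_SEQ_LEN - prompt_tokens_estimate
    if budget <= 0:
        raise ValueError("prompt alone fills the 4096-token window")
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": min(max_new_tokens, budget),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request(
    "http://localhost:8000",  # hypothetical address of the running server
    "aitfSR4/ub-sr04-qwen3.5-4b-cpt2-sft-game",
    [{"role": "user", "content": "Create one quiz question."}],
    prompt_tokens_estimate=3500,
)
body = json.loads(req.data)
print(req.full_url, body["max_tokens"])

# To actually send the request against a running server:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Building the request separately from sending it keeps the token-budget logic easy to check without a GPU or a live endpoint.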

Good for

Developers building applications that require automated generation of game content for the Sekolah Rakyat platform.
Projects needing an OpenAI-compatible LLM endpoint for structured text generation.
Use cases where efficient deployment on GPUs with varying VRAM (e.g., L4, T4) is crucial.
