Name: Lyte/QuadConnect2.5-0.5B-v0.0.9b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Lyte

Overview

Lyte/QuadConnect2.5-0.5B-v0.0.9b is a specialized 0.5-billion-parameter Small Language Model (SLM) designed to play Connect Four. Developed by Lyte, it is built on the Qwen 2.5 base model and trained using Group Relative Policy Optimization (GRPO) on the Lyte/ConnectFour-T10 dataset. It is an early experimental version (v0.0.9b) whose reward functions are still evolving.

Key Capabilities

Connect Four Strategy: The model is trained to identify winning moves, block the opponent's potential wins, and control the center of the Connect Four board.
XML Response Format: It generates moves and reasoning in a structured XML format, detailing its thought process and chosen column.
Performance: In evaluation on the validation split, accuracy in predicting the correct move peaked at 14.03%, at a temperature of 0.8.
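
As an illustration of consuming the XML response format described above, here is a minimal parsing sketch. The tag names `reasoning` and `move` are assumptions made for this example, as is the sample response; the full model card defines the actual schema. A regular expression is used instead of a strict XML parser because a small model's output may not always be well-formed.

```python
import re
from typing import Dict, Optional

# Assumed tag names (hypothetical); check the model card for the real schema.
REASONING_TAG = "reasoning"
MOVE_TAG = "move"


def extract_tag(text: str, tag: str) -> Optional[str]:
    """Return the stripped content of the last <tag>...</tag> pair, or None."""
    matches = re.findall(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def parse_response(text: str) -> Dict[str, Optional[str]]:
    """Split a model response into its reasoning and its chosen move.

    The move is returned as a raw string; how it maps to a board column
    is left to the caller, since the text does not specify the encoding.
    """
    return {
        "reasoning": extract_tag(text, REASONING_TAG),
        "move": extract_tag(text, MOVE_TAG),
    }


# Hypothetical response, for illustration only.
sample = "<reasoning>Take the center column.</reasoning>\n<move>d</move>"
parsed = parse_response(sample)
```

A caller would typically treat a missing `move` value as an invalid turn and either re-sample the model or fall back to a default move.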

Training Details

The model's training data was derived from the Leon-LLM/Connect-Four-Datasets-Collection, filtered to include only games with 10 or fewer turns. Training was conducted using TRL's GRPO framework. The model's performance was evaluated at three temperature settings (0.6, 0.8, and 1.0).
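
The filtering step described above can be sketched as follows. This is not the author's preprocessing code: it assumes each game is stored as a sequence of moves and that one turn equals one move, neither of which the text confirms. The sample games are made up.

```python
from typing import Iterable, List, Sequence

MAX_TURNS = 10  # threshold stated in the text


def filter_short_games(
    games: Iterable[Sequence[str]], max_turns: int = MAX_TURNS
) -> List[Sequence[str]]:
    """Keep only games with at most `max_turns` moves.

    Assumption: one list element is one turn.
    """
    return [game for game in games if len(game) <= max_turns]


# Hypothetical games, encoded here as lists of column labels.
games = [
    ["d", "d", "c", "e", "b", "e", "a"],   # 7 turns, kept
    ["a"] * 11,                            # 11 turns, dropped
    ["d", "c"] * 5,                        # exactly 10 turns, kept
]
short_games = filter_short_games(games)
```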

Use Cases

Connect Four AI: Ideal for integrating an AI player into Connect Four applications or simulations.
Reinforcement Learning Research: Useful as a case study for applying GRPO to game-playing agents with small language models.
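
For the reinforcement learning use case, a reward function for GRPO might look like the sketch below. It follows the common TRL convention of a callable that receives a batch of completions plus extra dataset columns as keyword arguments and returns one float per completion. Everything specific here is an assumption for illustration: the `<move>` tag, the `answer` column name, and the reward values. The model's actual reward functions are described as still evolving and are not reproduced here.

```python
import re
from typing import List

# Assumed output tag (hypothetical); see the model card for the real format.
MOVE_RE = re.compile(r"<move>(.*?)</move>", re.DOTALL)


def move_reward(completions: List[str], answer: List[str], **kwargs) -> List[float]:
    """Score each completion against a reference move.

    1.0 : a move tag is present and matches the reference move
    0.1 : a move tag is present but the move differs (format-only credit)
    0.0 : no move tag found
    The values are illustrative, not tuned.
    """
    rewards = []
    for completion, target in zip(completions, answer):
        match = MOVE_RE.search(completion)
        if match is None:
            rewards.append(0.0)
        elif match.group(1).strip() == target.strip():
            rewards.append(1.0)
        else:
            rewards.append(0.1)
    return rewards


# Hypothetical batch: correct move, wrong move, malformed output.
batch = ["<move>d</move>", "<move>a</move>", "I would play d"]
scores = move_reward(batch, answer=["d", "d", "d"])
```

Giving partial credit for a well-formed but wrong move is one way to keep the model producing parseable output early in training; whether that trade-off helps in practice would need to be checked empirically.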
