YiPz/llama3-8b-pokerbench-sft

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 8, 2026 · License: llama3 · Architecture: Transformer

YiPz/llama3-8b-pokerbench-sft is an 8-billion-parameter Llama 3.1 Instruct model fine-tuned by YiPz specifically for poker decision-making. Trained on the PokerBench dataset, it generates poker actions such as fold, call, check, bet, or raise, and is designed to give expert-level strategic responses to poker scenarios, making it well suited to game-theory applications in poker.


Model Overview

YiPz/llama3-8b-pokerbench-sft is a specialized 8 billion parameter language model, fine-tuned from the Meta-Llama-3.1-8B-Instruct base model. Its primary purpose is to act as an expert poker player, generating strategic decisions for various poker scenarios.

Key Capabilities

  • Poker Decision-Making: Generates appropriate poker actions (fold, call, check, bet, raise) based on given game states.
  • Specialized Training: Fine-tuned using the PokerBench dataset (RZ412/PokerBench) with LoRA for 5,000 steps, optimizing its performance for poker strategy.
  • Structured Output: Provides actions within <action></action> XML-like tags, ensuring consistent and parseable responses.
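Because actions always arrive inside `<action></action>` tags, downstream code can extract them with a small helper. A minimal sketch (the example response string is illustrative, not actual model output):

```python
import re

def parse_action(response: str):
    """Extract the poker action from the model's <action></action> tags.

    Returns the tag contents with surrounding whitespace stripped,
    or None if no action tag is present.
    """
    match = re.search(r"<action>\s*(.*?)\s*</action>", response, re.DOTALL)
    return match.group(1) if match else None

# Hypothetical model response:
action = parse_action("Given the pot odds, I would <action>raise 20</action>.")
print(action)  # -> raise 20
```

Returning `None` on a missing tag lets callers detect malformed generations and retry rather than silently misplaying a hand.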

Use Cases

  • Poker AI Development: Ideal for integrating advanced poker decision logic into AI agents or bots.
  • Strategic Analysis: Can be used to analyze poker scenarios and understand optimal play.
  • Educational Tools: Potentially useful in developing tools for learning poker strategy.

Technical Details

The model was trained with a batch size of 128 and a learning rate of 1e-6. Quantized GGUF versions are also available for efficient deployment with tools like llama.cpp and Ollama.
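When running the GGUF builds locally, the prompt must follow the Llama 3.1 chat template. The exact system instruction used during fine-tuning is not documented here, so the one below is an assumption; this sketch only builds the prompt string you would pass to llama.cpp or a similar runtime:

```python
def build_poker_prompt(game_state: str) -> str:
    """Format a poker game state as a Llama 3.1 chat prompt.

    The system instruction is illustrative; the prompt actually used
    during PokerBench fine-tuning may differ.
    """
    system = ("You are an expert poker player. Respond with your chosen "
              "action inside <action></action> tags.")
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{game_state}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_poker_prompt("Hero holds Ah Kh on the button; villain raises to 3bb.")
```

With Ollama or llama.cpp's chat endpoints the template is usually applied for you, so plain system/user messages suffice there.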