YiPz/llama3-8b-pokerbench-sft
YiPz/llama3-8b-pokerbench-sft is an 8-billion-parameter Llama 3.1 Instruct model fine-tuned by YiPz for poker decision-making. Trained on the PokerBench dataset, it generates poker actions (fold, call, check, bet, or raise) and is designed to give expert-level strategic responses in poker scenarios, making it a specialized model for game-theoretic poker applications.
Model Overview
YiPz/llama3-8b-pokerbench-sft is a specialized 8-billion-parameter language model, fine-tuned from the Meta-Llama-3.1-8B-Instruct base model. Its primary purpose is to act as an expert poker player, generating strategic decisions for a wide range of poker scenarios.
Key Capabilities
- Poker Decision-Making: Generates appropriate poker actions (fold, call, check, bet, raise) based on given game states.
- Specialized Training: Fine-tuned using the PokerBench dataset (RZ412/PokerBench) with LoRA for 5,000 steps, optimizing its performance for poker strategy.
- Structured Output: Wraps its chosen action in `<action></action>` XML-like tags, ensuring consistent and parseable responses.
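Because the model wraps its decision in `<action></action>` tags, downstream code can extract and validate the action with a small parser. The sketch below is illustrative: the exact output format beyond the tags (e.g. whether a bet size follows the action word) is an assumption, not documented behavior.

```python
import re
from typing import Optional

# The five action types the model is trained to emit (per the model card).
VALID_ACTIONS = {"fold", "call", "check", "bet", "raise"}

def extract_action(model_output: str) -> Optional[str]:
    """Pull the action out of a response wrapped in <action></action> tags.

    Returns the action string (e.g. "fold" or, hypothetically, "raise 300"),
    or None if no well-formed, valid action tag is found.
    """
    match = re.search(r"<action>\s*(.*?)\s*</action>", model_output, re.DOTALL)
    if match is None:
        return None
    action = match.group(1)
    if not action:
        return None
    # The first token should be one of the five poker actions.
    if action.split()[0].lower() not in VALID_ACTIONS:
        return None
    return action
```

Validating against the closed action set guards an agent loop against malformed generations: a `None` result can trigger a retry or a safe default such as folding.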
Use Cases
- Poker AI Development: Ideal for integrating advanced poker decision logic into AI agents or bots.
- Strategic Analysis: Can be used to analyze poker scenarios and understand optimal play.
- Educational Tools: Potentially useful in developing tools for learning poker strategy.
Technical Details
The model was trained with a batch size of 128 and a learning rate of 1e-6. Quantized GGUF versions are also available for efficient deployment with tools like llama.cpp and Ollama.
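For local deployment via Ollama, a GGUF build can be registered with a Modelfile. This is a minimal sketch: the GGUF filename and the system prompt below are illustrative assumptions, not the actual release artifacts.

```
# Hypothetical Modelfile for serving a quantized build with Ollama.
# The filename below is illustrative; substitute the actual GGUF release file.
FROM ./llama3-8b-pokerbench-sft.Q4_K_M.gguf

# Deterministic decoding suits discrete action selection.
PARAMETER temperature 0

# Assumed system prompt matching the model's structured-output convention.
SYSTEM You are an expert poker player. Respond with your action inside <action></action> tags.
```

The model would then be built and run with `ollama create pokerbench -f Modelfile` followed by `ollama run pokerbench`.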