Course:CPSC312-2016-Project2- Poker Revamped

From UBC Wiki

Title

Authors: Thomas Roche, Aaron Mishkin

What is the problem?

In project one we implemented a simple artificial agent that could play a reduced form of poker against a human player. Please see the previous project proposal for a description of reduced poker and our conclusions from the first project.

We propose to return to this simple form of poker for project two and reimplement it in Haskell, although with significant changes to both the game's rules and the agent that plays it. Reimplementing reduced poker in Haskell will provide an opportunity to closely examine the differences between logic and functional programming and will allow an informed judgement to be made on which paradigm is more suitable for constructing small and expressive representations of games. More importantly, returning to reduced poker will allow deficiencies in the rules of the game to be corrected and provides a platform for making key improvements to the artificial agent.

The results of project 1 demonstrated that reduced poker suffered as a platform because of its rules. The game provided no mechanism for the human player or the artificial agent to exchange a subset of the cards that they were dealt, such as in Five-Card Draw. A direct result of this was that good hands were very uncommon and the vast majority of hands in reduced poker were of the two-pair, or high-card "classes". Evaluating the performance of the intelligent agent over a large subset of poker hands was impossible as a result. It also made the game significantly less fun for humans to play. To correct this, a system for players to exchange a subset of their hands for new cards will be introduced into the game. The rules for betting will be adjusted accordingly, with players taking turns betting multiple times over the course of a hand. Multiple rounds of betting also allows the two players to be notified of eachother's bets, an important mechanic in real poker.

The changes to the rules of reduced poker proposed above radically change the nature of the game and the design of the artificial agent will need to be greatly improved to accommodate this. A introducing a card exchange system requires that the computer agent have some method of determining which cards in its hand it should exchange, and the improvements to the betting rules introduce a new layer of complexity to how the agent calculates its bets. It should be noted that the changes to the betting system allow the main way to "game" the agent in project one (betting a only single dollar each hand) to be corrected by an improved method for calculating bets.

In summary, the problem is to reimplement reduced poker in Haskell with improvements to both its rules and the artificial agent that plays it. We will call this new version reduced poker+. This will allow new conclusions to be drawn about both the suitability of Haskell compared to Prolog and the extent to which a simple and effective artificial agent that plays reduced poker+ can be developed.

What is the something extra?

Just as in the project one, the something extra is the attempt to develop an effective policy that an artificial agent may follow when playing reduced poker+ against a person. The improvements to reduced poker introduce many new areas of complexity that will have to be addressed in the artificial agent. The agent's policy should follow from the rules of reduced poker+ and must be computed efficiently enough to be used. More advanced formalisms that can be implemented with reasonable effort, such as Q-learning, are being considered for this new agent.

What did we learn from doing this?

What learned learned from building reduced poker+ can be divided into two main categories: the suitability of Haskell vs. Prolog for building reduced poker+; and implementing reduced poker+ as well as the artificial agent we designed to play it.

Haskell vs. Prolog

Building reduced poker in Prolog and then reduced poker+ in Haskell revealed many of the strengths and weaknesses of the two languages and, of the two, we claim that Haskell is more suitable for building small games and agents that can play them. Haskell does a much better job of supporting development than Prolog and has a much better system for doing IO tasks. Haskell has functionality for building custom data structures and data types, its control flow is explicit and easily manipulated, and its compilation time type-checking is deeply useful. In comparison, only provides data structures through function symbols, is built on a powerful, but non-obvious, system for handling the flow of control that cannot be directly manipulated, and has no type checking aside from pattern matching. Note that while function symbols can be used to create the same data structures as can be defined in Haskell, the process of doing so is less intuitive. In addition to all of the above, the Haskell's standard library is large and easier to use than Prolog's and providing powerful functions that ease building complex functions. Simply put, it is easier to write code in Haskell than it is to do so in Prolog and this makes it a more suitable language.

The software engineering features of Haskell were key to building reduced poker+. Development started with the identification of function types and key data structures and then proceeded to to implementation. Starting from this strong position was only possible because Haskell provides functionality for building data structures outside of application code. Contrasting this are Prolog's function symbols that are only defined in rules, meaning that data structures are most easily defined as code is written. Consequently, the data structures used in reduced poker underwent frequent and time consuming changes. This was compounded by the lack of type checking in Prolog. Unlike in the Prolog project, most bugs during the building of reduced poker+ were caught and fixed because of Haskell's static type checking and custom type definitions rather than manual testing. The fact that Haskell's design produces good software practices while Prolog requires that they be enforced against the structure of the language is a strong argument for using Haskell over Prolog.

Manipulating the flow of program control in Haskell is much easier than it is in Prolog. Haskell has many built in features for control flow, such as "if" statements and guards for functions, "when" and "unless" statements for operations inside "do" blocks, and pattern matching for automatically identifying what functions should be called. In comparison, Prolog only has pattern matching, although it is more complete and allows for pattern matching in rule bodies in as well as heads. However, this is not necessarily good, as many rules in Prolog are partially evaluated until a pattern mismatch is found, which leads to confusing control flow.

IO operations are massively better in Haskell. Based on "do" blocks and the IO type wrapper around other types, IO in Haskell is initially hard to grasp. However, "do" blocks are very useful once understood and provide and excellent mechanism for interacting with users through the terminal. In fact, the core procedures of reduced poker+ are "do" block's that manipulate the state of the game, either through a human player and the terminal, or a rule-based computer player. In comparison, IO in Prolog accomplished similarly to everything else: through predicates. This caused some of the worst issues that were encountered when building reduced poker, in which users were queried an unpredictable number of times because of Prolog's pattern matching in predicate bodies.

The conclusion that we derive from these comparisons between Prolog and Haskell is that Haskell was much more suited to building a small game, not because it is more powerful, but because it encourages better software practices by the nature of its design, it is more predictable, and it is easier to do IO in.

Reduced Poker+ and its Agent

Creating reduced poker+ required re-implementing core functionality that was first built for reduced poker as well as implementing new features.

Functions for creating hands, determining hand types, assigning unique scores to hands, and computing bets for the artificial agent were taken from the reduced poker and re-created in Haskell, albeit with significant improvements. For example, the system for creating players hands had to be overhauled to allow for the subsets of the cards within them to be exchanged partway through each round of poker. This was done by "dealing" cards from a deck, which was then carried through each round of poker as a part of the game state. New cards were then easily dealt from this deck during card exchanges, thus removing the possibility of duplicate cards. Overall, this part of development was relatively easy and very well suited to Haskell. Functions are natural way to characterize rules. Given a particular input, they always return the same output, which makes them ideal for codifying fixed elements of reduced poker+, like hand scores and types.

Building the game portion of reduced poker+, where the betting rounds and card exchanges actually occur, was a challenging element of the project. This was because it required learning to work Haskell's IO type. As said above, we believe IO is one of Haskell's key advantages over Prolog. However, grasping how to properly use it as part of a larger program was initially difficult. When trying to integrate IO code with "regular" Haskell functions, there was strong desire to simply "take" the value from within the IO type, something that is not possible in Haskell. Instead, the non-IO functions must be called within a "do" block. The syntax of "do" blocks and their usage was also a stumbling block. With this said, once we understood how to IO is accomplished in Haskell, completing the game portion of the project was very easy; it was much easier than building the rule functions for creating and evaluating poker hands.

Building the agent to play reduced poker was hardest part of the project. As in the implementation of reduced poker, we opted for a rule-based system.

Overall, the performance of the betting mechanism designed for the agent in project one was satisfactory and was replicated with a modification in reduced poker+. The formula for calculating bets was changed to prevent users from gaming the agent by betting a single dollar every round by adding a clause for extremely low user bets. If the user was found to have bet less than 5% of their money, then the computer agent only bets 5% of its available money.

As said in the something extra description, we intended to use reinforcement learning to teach the agent how to exchange its cards. However, this was not possible for reasons of scale. If a state is considered a hand of cards, and an action and subset of those cards to keep, then there are an extremely large number of states and available actions in those states. A hand can be any 5 card subset of the 52 playing cards in a deck. This is 2598960 states and 32 actions in each state. We considered reducing the number of states by ignoring the card suits, but there are two key flaws with this. Firstly, flush and partial flushes are completely ignored in such a characterization. Secondly, the number of states is not reduced by this measure to a workably small number. The incredible size of the state space, and the large number of actions in each state would have required that a reinforcement learning algorithm be run for much too long a time to achieve reasonable results. So, in recognition of this fact, we decided to use a rule based system. The rules that we settled on are as follows:

-- Straight Flush, Straight, Flush, Full House - Keep all cards. -- Four-of-a-Kind, Three-of-a-Kind, Two-of-a-Kind - Keep the cards in the set. -- Two Pairs - Keep the pairs.

-- HighCard - If a partial straight of at least 3 cards exists, keep those cards. -- - If a partial flush of at least 3 cards exists, keep those cards. -- - Otherwise, keep no cards.

This system works reasonably well and the computer agent can achieve excellent hands when using. However, this anecdotal evidence and a much larger analysis is needed to determine its real performance. At the moment, all that we can accurately say is that it has the seeming of a reasonably intelligent opponent from the player's perspective.