In my last post I have shared a rather simple python code to build a decision tree classifier to recognize a hand in a poker game. The simplicity of my solution stemmed from the fact that I added features that guided the tree model to learn accurate class representations. In particular, I added binary features to signal ‘four-of-a-kind’, ‘three-of-a-kind’, etc. I was able to add these features because I had a full domain knowledge. But what could I do if I did not have any idea of what these features should be?
Well, the authors of the data set have published several papers on this. One of them is Evolutionary Data Mining with Automatic Rule Generalization from 2002 where they introduced improvements made to the RAGA system capable of extracting predictive rules. They used evolutionary search algorithm (genetic algorithm and genetic programming) to synthesize new rules, select rules with best support and coverage, and apply a generalization step to the best rules. Essentially the recipe includes the following functionality:
- Synthesis of population of rules and pruning invalid rules.
- Fitness evaluation function.
- Rules cross-over and mutation.
- Rules generalization function.
The rules synthesis can be achieved with a combination of features on “=” and “!=”. The fitness evaluation can be a standard rules support and confidence metrics. The cross-over and mutation can be a step where single features are “AND”ed together to create longer rules. The rule-generalization step is the most complex. It would involve grouping commonly occurring item-sets into new symbols. For example, if we have several rules that include three ranks being the same, we could combine it into a single item-set:
After several generations, such symbols should evolve to more general rules (i.e. no reference to ‘7’, just R1=R2=R3). Here, the introduction of new symbols into the item set is the feature engineering step that I performed manually in my decision tree code.