Decision Trees are tree-like structures that depict different possible decisions that can be made and their outcome for a given problem.
Principal Author: Ekta Aggarwal
This page gives an overview about decision trees , their building blocks, and learning a decision tree.
Decision trees, as the name suggests are particularly useful in making decisions for problems which have historical data. This historical data contains the choices made and their outcomes. As this historical data is used to build a decision tree, it is of utmost importance that the data is clean and correct. After generating the decision tree, we can use it to make a decision for any new data set. Decision trees are also used in machine learning to classify the output of a given data input.
Decision trees are built upon nodes, decision rules and a representative form of depicting the decisions and their outcomes (arcs, squares and circles).
The content of this page is based on mainly Artificial Intelligence: Foundations of Computational Agents and some other sources which are specified as they are used.
Influence diagrams and Decision Analysis tools are related to Decision trees, as they are also useful in making decisions.
Consider a case where a person wants to decide if he wants to open a tattoo shop or a lemonade stand. To make this decision he needs to carefully weigh all expected profits and losses for each business, before he invests in it. At the end he will make a decision based on the expected net profit of the two businesses and will try to maximize it.
Situations similar to this can be found in every day lives of people, where we need to choose between different opportunities / tasks keeping in mind their consequences. An educated decision can be made based on the historical data related to the problem. This data can be used to build a decision tree and its branches can be traversed to reach an optimum decision.
An example of using historical data for making decision using decision tree is as follows: Here we have some historical data for Bob containing the days on which he was unwell and some factors that we think might have caused it. So, our task is to predict that given a new day 15, with weather conditions being rainy, high humidity and weak wind, whether or not Bob will be unwell.
To get the answer to this question, we need to build a decision tree. We consider all the attributes that are present in data i.e. Forecast, Humidity and Wind. We pick Forecast as the starting point of the tree and split it into three branches, each representing the three values that can be used for Forecast. Here we check if the subset of given data in each branch is pure or not. Pure data for this particular scenario means that either the data will have all entries which suggest Bob gets unwell or Bob doesn't get unwell. But if we see a mixture of these outcomes in data set, we branch out the tree until we get pure data sets. After getting pure data subsets, we will have a decision tree as shown in Figure:1.
Now, we return back to our main question, will Bob get unwell on Day 15? To get the answer we traverse the decision tree with branches: Forecast = Rain, Humidity = High and Wind = Weak, until we reach the leaf node which says that Bob will get unwell based on the historical data.
Decision trees consist of nodes as one of their building blocks. Nodes can be one of three types based on their placement in the tree. Nodes can be classified as follows:
One can read a decision tree starting from its root and then following its arcs and making decisions along the way. Filtering down the tree, eventually a leaf node is reached which shows the outcome of the decisions made. A decision tree can also be viewed as a if-then-else structure found in programming languages.
Example: 1 - Carrying forward the example of selecting a business of tattoo shop or the lemonade stand, we can build a decision tree to help us evaluate a more profitable venture. As shown in the decision tree in Figure: 2, lemonade stand and tattoo shop each has an equal chance of success and failure i.e. 50%. Going further, if it was expected that 100 dollars could be earned from tattoo shop and 90 dollars from lemonade stand in case of successful business. While a loss of 30 dollars and 10 dollars would be incurred in the case of tattoo shop and lemonade stand respectively. Based on this data, it can be seen that a tattoo shop will lead to net profit of 35 dollars based on the expected profit and loss, while lemonade stand generates 40 dollars net profit. Thus, in this case it will be profitable to run a lemonade stand.
Example: 2 - Another example of a decision tree shown in Figure:3 takes into account the decisions made by delivery robot in determining how it wants to deliver a package. There are some possible outcomes such as, the robot could be damaged while delivering or the task will be completed without any harm. The task for the robot is to go from point A to point B, and in order to achieve the same it face uncertain future events like trip or fall from the stairs or slippery surface. To reduce the damage that the robot will incur from falling, it can wear pads. The pads will help in creating a cushion for the robot, thus lessening the damage. But on the downside, as the pads have weight, they reduce the speed at which the robot walks. Here the robot can now make a choice of wearing pads or not and taking the longer or shorter path which will determine whether an accident occurs or not.
The above examples describe two very different scenarios in which the decision trees can be used.
Up till now we have seen what decision trees are and how they can be used to make decisions. In order to make decisions on the basis of a decision tree, we first need to construct one. But before we construct a decision tree we need to keep in mind what kind of tree we want. A number of decision trees can be generated based on the input features of the historical data. However, trees with minimum height are preferred i.e. trees with minimum number of nodes, as they ease the number of decisions that are to be taken.
Thus, a naive approach to construct a decision tree could be searching the entire state space for the smallest decision tree. The decision tree which fits all the data is a good candidate but the state space grows exponentially. Thus, we need to perform a greedy search with the goal of minimizing error. The algorithm for building a decision tree based on the approach described above is as follows (This algorithm is from ).
Decision tree is built starting from the root and works it way down till the leaf node. Input to the algorithm are a set of training examples, input conditions and target feature. The algorithm follows the recursive nature of the tree. It works by testing the stopping criteria at all levels, if the stopping criteria is not true, then it splits the branch further. Thus, recursively building sub trees of data sets.
Decision trees are important and widely applicable when informed decisions are to be made. Particularly they are helpful as:
Along with the advantages, decision trees also have some disadvantages, they are: