eScholarship
Open Access Publications from the University of California


UCLA Electronic Theses and Dissertations

Semi-Greedy Construction of Oblique-Split Decision Trees

Abstract

Classification and Regression Trees (CART) are a method of structured prediction widely used in machine learning. Favored for their robustness to non-linear relationships and their interpretability (James, Witten, Hastie, & Tibshirani, 2013), they have seen continued application since their broad introduction in 1984 (Breiman, Friedman, Olshen, & Stone, 1984). The shortcomings of a single-tree model, especially overfitting and lack of competitiveness compared to other machine learning methods, have been overcome through advances in bagging, pruning, and boosting (James et al., 2013). Decision trees are fitted in a top-down, greedy fashion because non-greedy fitting is computationally intractable. While this leads to efficient computation, top-down greedy estimates can produce suboptimal models. To overcome this suboptimality, I propose a semi-greedy method of decision tree construction. First, I reformulate the problem based on the work of Norouzi et al. (Norouzi, Collins, Johnson, Fleet, & Kohli, 2015) and use an algorithm that optimizes the split functions at all levels of the tree jointly, based on a global objective. This algorithm is parameterized by a single scalar, for which I propose a bound. Second, I propose a dimensionality reduction step for cases where the semi-greedy algorithm becomes intractable. On the five datasets tested (Breast Cancer Wisconsin, Banknote Authentication, Haberman Survival Data, Blood Transfusion Data, Fertility Data), the algorithm produces higher accuracy than two existing greedy tree packages in R (rpart, tree), and it is computationally tractable for low-width datasets (approximately 8 features or fewer). For higher-dimensional problems, I propose a preprocessing step that reduces the higher-width data to a more tractable subset. I show that a dimensionally reduced space, processed using the semi-greedy algorithm, outperforms the greedy methods (rpart, tree) trained on both the full and the reduced data sets.
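The two-stage setup the abstract describes, a greedy CART baseline plus a dimensionality-reduction preprocessing step for wider data, can be illustrated with off-the-shelf tools. This is a minimal sketch only: it uses scikit-learn's greedy CART (a Python analogue of R's rpart/tree) and PCA as a stand-in reduction, not the thesis's semi-greedy algorithm, and the 8-feature width is taken from the tractability threshold cited above.

```python
# Sketch of the greedy baseline and the dimensionality-reduction
# preprocessing described in the abstract. NOT the semi-greedy
# algorithm itself: scikit-learn's DecisionTreeClassifier is a
# greedy top-down CART, analogous to R's rpart/tree packages.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Breast Cancer Wisconsin: one of the five datasets tested (30 features).
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Greedy tree trained on the full-width data.
full_acc = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)

# Preprocessing step: reduce to a low-width representation
# (<= 8 features, the tractability threshold cited in the abstract).
pca = PCA(n_components=8).fit(X_tr)
red_acc = DecisionTreeClassifier(random_state=0).fit(
    pca.transform(X_tr), y_tr).score(pca.transform(X_te), y_te)

print(f"greedy tree, full data:    {full_acc:.3f}")
print(f"greedy tree, reduced data: {red_acc:.3f}")
```

In the thesis the reduced representation would instead be handed to the semi-greedy joint optimizer; the sketch only shows that the preprocessing step brings a 30-feature problem under the width at which that optimizer is reported to be tractable.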
