Type
Text
Type
Dissertation
Advisor
Kuan, Pei Fen | Zhu, Wei | Wu, Song | Xiao, Keli
Date
2017-05-01
Keywords
AUC, Imbalanced Classification, Model Ensemble, Random Forest, ROC, Tree Based Method | Statistics
Department
Department of Applied Mathematics and Statistics
Language
en_US
Source
This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.
Identifier
http://hdl.handle.net/11401/77355
Publisher
The Graduate School, Stony Brook University: Stony Brook, NY.
Format
application/pdf
Abstract
The imbalanced class problem in classification arises in many realistic scenarios, such as the detection of a rare condition. One solution is to design algorithms that account for the unbalanced classes during the training of a classifier. In this dissertation, we propose a novel multi-class classification tree based on the area under the ROC curve (AUC) to address the imbalanced classification problem. At the node attribute selection stage, this tree classifier aims to maximize the sum of the AUCs of all one-versus-all classifiers; at the node threshold selection stage, it balances the sensitivity and specificity of all one-versus-all classifications. The ROC tree is extended to an ROC random forest with suitable modifications. Furthermore, the volume under the surface (VUS), the extension of AUC to multi-class classification, is discussed in this dissertation and used to measure the performance of classifiers. The simulation results show that this multi-class ROC tree/forest method is superior to the classic CART/random forest on severely imbalanced multi-class classification problems, while the ROC random forest performs as well as the SMOTE random forest on imbalanced binary classification problems. An application to the Boston housing data shows that the ROC random forest can also be used for model ensembling, where it performs better than all the base models and the other ensemble methods considered. | 100 pages
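The attribute selection criterion described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names (auc, ova_auc_sum, best_attribute), the toy data, the tie handling, and the use of max(a, 1 - a) to make each AUC direction-agnostic are all assumptions made for illustration. It only shows the idea of scoring each candidate attribute by the sum of its one-versus-all AUCs and picking the attribute with the largest sum; the threshold selection stage is not covered.

```python
def auc(scores, positives):
    """AUC as the Mann-Whitney statistic: P(positive score > negative score).

    Ties count as 1/2. Returns 0.5 when either group is empty
    (an assumption; the AUC is undefined in that case).
    """
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        return 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ova_auc_sum(feature, labels):
    """Sum of one-versus-all AUCs when a single attribute is used as the score."""
    total = 0.0
    for c in sorted(set(labels)):
        a = auc(feature, [y == c for y in labels])
        # Assumption: an attribute that ranks a class low is as useful as one
        # that ranks it high, so take the better of the two directions.
        total += max(a, 1.0 - a)
    return total


def best_attribute(X, labels):
    """Return (index of the attribute with the largest AUC sum, all sums)."""
    n_features = len(X[0])
    sums = [ova_auc_sum([row[j] for row in X], labels) for j in range(n_features)]
    return max(range(n_features), key=lambda j: sums[j]), sums


# Hypothetical imbalanced 3-class toy data: six samples of class 0,
# two of class 1, one of class 2. Attribute 0 is ordered by class;
# attribute 1 is unrelated to the class.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2]
X = [[1, 5], [2, 1], [3, 4], [4, 2], [5, 3], [6, 6], [7, 2], [8, 5], [9, 3]]

best, sums = best_attribute(X, labels)
print("selected attribute:", best)
print("one-versus-all AUC sums:", sums)
```

Because each one-versus-all AUC is a rank-based quantity, a minority class contributes to the sum on the same scale as the majority class, which is the motivation the abstract gives for using it on imbalanced data.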
Recommended Citation
Yan, Jiaju, "Multi-Class ROC Random Forest for Imbalanced Classification" (2017). Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions). 3175.
https://commons.library.stonybrook.edu/stony-brook-theses-and-dissertations-collection/3175