
dc.contributor.author: Ganesan, Sivanesan
dc.date.accessioned: 2014-03-04T19:58:54Z
dc.date.available: 2014-03-04T19:58:54Z
dc.date.issued: 2011-05
dc.identifier.other: ganesan_sivanesan_201105_ms
dc.identifier.uri: http://purl.galileo.usg.edu/uga_etd/ganesan_sivanesan_201105_ms
dc.identifier.uri: http://hdl.handle.net/10724/27128
dc.description.abstract: There are a number of learning methods that provide solutions to classification and regression problems, including Linear Regression, Decision Trees, KNN, and SVMs. These methods work well in many applications, but they struggle with real-world problems that are noisy, non-linear, or high-dimensional. Furthermore, missing data (e.g., missing historical features of companies in stock data) is not handled well by current approaches. We present an implementation of a hybrid learning system that combines an ensemble of decision trees (Random Forest) with Linear Regression. Linear Regression (LR) is fast but not accurate because it assumes linearity, while Random Forests are not as fast as LR but have been shown to be accurate for high-dimensional and large data sets. By combining these approaches we address the weaknesses of each and exploit their strengths, both in real-time performance and in accuracy. In this thesis, we evaluate a hybrid Random Forest and Linear Regression implementation called "Regression Leaf Forest", a forest of trees with regression leaves for supervised learning problems. The approach extends Random Forests by introducing Linear Regression learners at the leaf nodes of the trees for predicting functions. Our empirical analysis on both real and artificial data shows that the proposed algorithm requires less computation time for both large and high-dimensional data sets while providing comparable or better accuracy than the Single Tree, Single Linear Regression Tree, and Random Forest algorithms.
dc.language: eng
dc.publisher: uga
dc.rights: public
dc.subject: Random Forest, Linear Regression
dc.title: Regression Leaf Forest
dc.title.alternative: a fast and accurate learning method for large & high dimensional data sets
dc.type: Thesis
dc.description.degree: MS
dc.description.department: Computer Science
dc.description.major: Computer Science
dc.description.advisor: Maria Hybinette
dc.description.committee: Maria Hybinette
dc.description.committee: Eileen T. Kraemer
dc.description.committee: Shelby Funk
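
The abstract describes a forest of trees whose leaves hold Linear Regression learners. The sketch below illustrates that general idea only; it is not the thesis implementation. Every name in it (`fit_linear`, `build_tree`, `build_forest`, and so on) is hypothetical. The split rule (a random feature, split at its median), the small ridge term, and the default parameters are simplifying assumptions, since the record does not say how the thesis chooses splits or fits leaves.

```python
import random

def fit_linear(X, y, ridge=1e-6):
    """Least-squares fit with intercept via normal equations.
    The tiny ridge term is an assumption added for numerical stability."""
    d = len(X[0]) + 1
    rows = [[1.0] + list(x) for x in X]
    # Augmented matrix [A^T A + ridge*I | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(d):
        p = max(range(c, d), key=lambda k: abs(M[k][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for k in range(d):
            if k != c:
                f = M[k][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][d] for i in range(d)]  # [intercept, w1, ..., wn]

def predict_linear(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def build_tree(X, y, rng, depth=0, max_depth=3, min_leaf=10):
    """Grow a tree; each leaf stores a linear model instead of a constant."""
    if depth >= max_depth or len(X) < 2 * min_leaf:
        return ("leaf", fit_linear(X, y))
    f = rng.randrange(len(X[0]))             # assumed rule: random feature
    thr = sorted(x[f] for x in X)[len(X) // 2]  # assumed rule: median split
    left = [i for i, x in enumerate(X) if x[f] < thr]
    right = [i for i, x in enumerate(X) if x[f] >= thr]
    if len(left) < min_leaf or len(right) < min_leaf:
        return ("leaf", fit_linear(X, y))
    return ("node", f, thr,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, depth + 1, max_depth, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, depth + 1, max_depth, min_leaf))

def predict_tree(tree, x):
    # Walk down to a leaf, then evaluate that leaf's linear model.
    while tree[0] == "node":
        _, f, thr, left, right = tree
        tree = left if x[f] < thr else right
    return predict_linear(tree[1], x)

def build_forest(X, y, n_trees=10, seed=0, **tree_kwargs):
    """Train each tree on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, **tree_kwargs))
    return forest

def predict_forest(forest, x):
    # Average the per-tree predictions, as in a standard regression forest.
    return sum(predict_tree(t, x) for t in forest) / len(forest)
```

Because each leaf fits a line rather than a constant, shallow trees can still follow locally linear structure in the data, which is one plausible reading of the computation-time advantage the abstract reports for the approach.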

