
    Regression Leaf Forest

    Date
    2011-05
    Author
    Ganesan, Sivanesan
    Abstract
    A number of learning methods provide solutions to classification and regression problems, including Linear Regression, Decision Trees, KNN, and SVMs. These methods work well in many applications, but they struggle with real-world problems that are noisy, non-linear, or high-dimensional. Furthermore, missing data (e.g., missing historical features of companies in stock data) is not managed well by current approaches. We present an implementation of a hybrid learning system that combines an ensemble of decision trees (Random Forest) with Linear Regression. Linear Regression (LR) is fast but inaccurate on non-linear data because it assumes linearity, while Random Forests are not as fast as LR but have been shown to be accurate for high-dimensional and large data sets. By combining these approaches we address the weaknesses of each and exploit their strengths in terms of both real-time performance and accuracy. In this thesis, we evaluate a hybrid Random Forest and Linear Regression implementation called "Regression Leaf Forest", which is a forest of trees with regression leaves for supervised learning problems. The approach extends Random Forests by introducing Linear Regression learners at the leaf nodes of the trees for predicting functions. Our empirical analysis on both real and artificial data shows that the proposed algorithm requires less computation time for both large and high-dimensional datasets while providing comparable or better accuracy than the Single Tree, Single Linear Regression Tree, and Random Forest algorithms.
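
The idea described in the abstract, a forest of trees whose leaves hold linear regression models, can be illustrated with a minimal pure-Python sketch. This is not the thesis implementation, which is not shown on this page. Every function name below is hypothetical, and the design choices (bootstrap sampling, a random half of the features per split, median thresholds, a tiny ridge term for numerical stability, averaging the trees' outputs) are simplifying assumptions made only for illustration.

```python
import random

def fit_linear(X, y, ridge=1e-6):
    """Least-squares fit with intercept via the normal equations.
    The tiny ridge term is an assumption added only to keep the system solvable."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    # Augmented system (A^T A + ridge*I | A^T y).
    M = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(d):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][d] / M[i][i] for i in range(d)]

def predict_linear(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def sse(y):
    """Sum of squared errors around the mean (used to score a split)."""
    if not y:
        return 0.0
    m = sum(y) / len(y)
    return sum((t - m) ** 2 for t in y)

def build_tree(X, y, rng, depth=0, max_depth=3, min_leaf=10):
    n, d = len(X), len(X[0])
    if depth >= max_depth or n < 2 * min_leaf:
        return {"leaf": fit_linear(X, y)}            # regression leaf
    best = None
    for f in rng.sample(range(d), max(1, d // 2)):   # random feature subset
        thr = sorted(x[f] for x in X)[n // 2]        # median threshold (assumed)
        left = [i for i in range(n) if X[i][f] < thr]
        right = [i for i in range(n) if X[i][f] >= thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, thr, left, right)
    if best is None:
        return {"leaf": fit_linear(X, y)}
    _, f, thr, left, right = best
    return {"feature": f, "thr": thr,
            "lo": build_tree([X[i] for i in left], [y[i] for i in left],
                             rng, depth + 1, max_depth, min_leaf),
            "hi": build_tree([X[i] for i in right], [y[i] for i in right],
                             rng, depth + 1, max_depth, min_leaf)}

def predict_tree(node, x):
    # Route the sample to a leaf, then apply that leaf's linear model.
    while "leaf" not in node:
        node = node["lo"] if x[node["feature"]] < node["thr"] else node["hi"]
    return predict_linear(node["leaf"], x)

def fit_forest(X, y, n_trees=10, seed=0, **kw):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]     # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, **kw))
    return forest

def predict_forest(forest, x):
    # Average the predictions of all trees.
    return sum(predict_tree(t, x) for t in forest) / len(forest)
```

Under these assumptions, `fit_forest(X, y)` takes a list of feature vectors and a list of targets, and `predict_forest(forest, x)` returns the averaged prediction for one sample. Because each leaf fits its own line, a piecewise-linear target can be captured with shallow trees, which is one plausible reading of how the hybrid trades tree depth for leaf-model expressiveness.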
    URI
    http://purl.galileo.usg.edu/uga_etd/ganesan_sivanesan_201105_ms
    http://hdl.handle.net/10724/27128
    Collections
    • University of Georgia Theses and Dissertations
