    Classification of genomic islands using decision trees and their ensemble algorithms

    View/Open
    1471-2164-11-S2-S1.xml (90.16Kb)
    1471-2164-11-S2-S1.pdf (494.8Kb)
    1471-2164-11-S2-S1-S1.PDF (33.97Kb)
    1471-2164-11-S2-S1-S2.PDF (20.56Kb)
    Date
    2010-11-02
    Author
    Che, Dongsheng
    Hockenbury, Cory
    Marmelstein, Robert
    Rasheed, Khaled
    Abstract
    Background: Genomic islands (GIs) are clusters of alien genes that are present in some bacterial genomes but are not seen in the genomes of other strains within the same genus. The detection of GIs is extremely important to the medical and environmental communities. Although several GI-associated features have been discovered, accurate detection of GIs is still far from satisfactory.

    Results: In this paper, we combined multiple GI-associated features, and applied and compared various machine learning approaches to evaluate classification accuracy on GI datasets from three genera (Salmonella, Staphylococcus, and Streptococcus) and on a mixed dataset of all three genera. The experimental results show that, in general, the decision tree approach outperformed the other machine learning methods according to five performance evaluation metrics. Using J48 decision trees as base classifiers, we further applied four ensemble algorithms (AdaBoost, bagging, MultiBoost, and random forest) to the same datasets. We found that, overall, these ensemble classifiers could improve classification accuracy.

    Conclusions: We conclude that decision-tree-based ensemble algorithms can accurately classify GIs and non-GIs, and we recommend these methods for future GI data analysis. The software package for detecting GIs can be accessed at http://www.esu.edu/cpsc/che_lab/software/GIDetector/.
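
    To illustrate the ensemble idea described in the abstract, the following is a minimal sketch of bagging in plain Python. It is not the authors' implementation: the paper uses J48 decision trees (Weka's implementation of C4.5) as base classifiers, whereas this sketch substitutes one-level decision trees (stumps) to stay short. The two feature columns, the toy data, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-level decision tree (a stand-in for J48, not J48 itself).

    Tries every feature, every observed value as a threshold, and both
    polarities, and keeps the first split with the fewest training errors.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):  # label predicted when row[f] > t
                preds = [above if r[f] > t else 1 - above for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return f, t, above


def predict_stump(stump, row):
    f, t, above = stump
    return above if row[f] > t else 1 - above


def train_bagging(rows, labels, n_estimators=25, seed=0):
    """Bagging: fit each base classifier on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ensemble.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def predict_bagging(ensemble, row):
    """Majority vote over the base classifiers."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: column 0 is an invented compositional-bias score,
# column 1 is an uninformative feature. Label 1 = GI, label 0 = non-GI.
non_gi = [(0.02 * i, (i * 7 % 10) / 10) for i in range(20)]
gi = [(0.60 + 0.02 * i, (i * 3 % 10) / 10) for i in range(20)]
rows = non_gi + gi
labels = [0] * 20 + [1] * 20

ensemble = train_bagging(rows, labels)
print(predict_bagging(ensemble, (0.95, 0.5)))  # region with a high score
print(predict_bagging(ensemble, (0.00, 0.5)))  # region with a low score
```

    Each base classifier sees a different bootstrap resample, so individual stumps place their thresholds slightly differently, and the majority vote smooths out those differences. Boosting methods such as AdaBoost instead reweight the training examples between rounds, which this sketch does not attempt.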
    URI
    http://dx.doi.org/10.1186/1471-2164-11-S2-S1
    http://hdl.handle.net/10724/19661
    Collections
    • Open Access Articles by UGA Faculty
