
    A large scale study of edit patterns in Wikipedia and its applications to vandalism detection

    Date
    2012-12
    Author
    Sethi, Deepika
    Abstract
    In recent years, Web 2.0 applications such as Wikipedia have transformed the landscape of the World Wide Web by elevating end users from passive consumers of information to active participants in content creation, organization, and propagation. Wikipedia is a free online encyclopedia that any user can edit with minimal restriction. Recent studies indicate that a large fraction of Internet users rely on Wikipedia for their information needs, so it is immensely important to ensure the quality and accuracy of the information shared there. Ironically, the open-edit nature of Wikipedia has also made it susceptible to various kinds of vandalism attacks. In this thesis, we perform a large-scale study of the edit patterns of Wikipedia articles. The goal of this study is to identify metadata characteristics that can help us distinguish between high-quality edits and potential vandalism attacks. Our study is unique in several respects. First, we trace the edit history of Wikipedia articles and study the stability of articles, their growth over time, and the nature of the users who perform the edits. Second, we study the spatial distribution of the origins of the edits. Third, we study the commonality of content and the commonality of users among various Wikipedia articles. Through this study, we show that several types of contextual attributes of edits, such as co-occurrence probabilities of words, registration status of edit contributors, and geographical region of origin of edits, have strong distinguishing capabilities with regard to vandalism.
    URI
    http://purl.galileo.usg.edu/uga_etd/sethi_deepika_201212_ms
    http://hdl.handle.net/10724/28599
    Collections
    • University of Georgia Theses and Dissertations
