dc.contributor.author: Howell, Matthew Raymond
dc.date.accessioned: 2014-03-04T18:59:15Z
dc.date.available: 2014-03-04T18:59:15Z
dc.date.issued: 2010-12
dc.identifier.other: howell_matthew_r_201012_ms
dc.identifier.uri: http://purl.galileo.usg.edu/uga_etd/howell_matthew_r_201012_ms
dc.identifier.uri: http://hdl.handle.net/10724/26915
dc.description.abstract: In this paper, a ranking mechanism is presented that ranks documents based on their Semantic Association Similarity, which is defined as the closeness (based on degrees of separation) of associations between the entities found in each document. A large semantic knowledge base with over 1.6 million entities and 24 million associations is used as the backend dataset for comparison. Multiple ranking techniques are evaluated and speed concerns are addressed. Bloom filters are used to improve ranking speed while introducing a small percentage of false positives. A real-world example of spam page identification is investigated.
dc.language: eng
dc.publisher: uga
dc.rights: public
dc.subject: Semantic Similarity
dc.subject: Semantic Associations
dc.subject: Semantic Relationships
dc.subject: Document Comparison
dc.subject: Degrees of Separation
dc.subject: Semantic Web
dc.subject: Bloom Filter
dc.subject: Relation Discovery
dc.subject: Web Spam
dc.title: Ranking documents using degrees of separation analysis from a large dataset of semantic relationships
dc.type: Thesis
dc.description.degree: MS
dc.description.department: Computer Science
dc.description.major: Computer Science
dc.description.advisor: Kang Li
dc.description.committee: Kang Li
dc.description.committee: Lakshmish Ramaswamy
dc.description.committee: Prashant Doshi


Files in this item

There are no files associated with this item.
