Ranking documents using degrees of separation analysis from a large dataset of semantic relationships
Howell, Matthew Raymond
This paper presents a mechanism that ranks documents by their Semantic Association Similarity, defined as the closeness (measured in degrees of separation) of the associations between the entities found in each document. A large semantic knowledge base with over 1.6 million entities and 24 million associations serves as the backend dataset for comparison. Multiple ranking techniques are evaluated and speed concerns are addressed. Bloom filters are used to improve ranking speed at the cost of a small percentage of false positives. A real-world example of spam page identification is investigated.
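As an illustration only, the two ideas named in the abstract can be sketched in Python: a breadth-first search that counts degrees of separation between entities, and a Bloom filter that answers membership queries quickly while allowing occasional false positives. Everything here is an assumption rather than the author's method: the function and class names, the adjacency-dictionary graph, the inverse-distance scoring formula, and the idea of storing entity-pair keys in the filter are all hypothetical.

```python
import hashlib
from collections import deque
from itertools import combinations


def degrees_of_separation(graph, source, target, max_depth=3):
    """Breadth-first search over an adjacency dict (hypothetical format).

    Returns the number of hops from source to target, or None when the
    target is not reachable within max_depth hops.
    """
    if source == target:
        return 0
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return None


def association_score(graph, entities, max_depth=3):
    """Assumed scoring rule: mean of 1/distance over all entity pairs.

    Pairs that are not connected within max_depth contribute 0.
    """
    pairs = list(combinations(sorted(set(entities)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        distance = degrees_of_separation(graph, a, b, max_depth)
        if distance:
            total += 1.0 / distance
    return total / len(pairs)


class BloomFilter:
    """Minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

One possible use, again an assumption, is to precompute keys such as "entityA|entityB" for pairs known to be closely associated and add them to the filter, so that most lookups avoid a graph traversal. Documents would then be ranked by sorting on association_score, with low scores flagging pages whose entities are unrelated to one another.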
Showing items related by title, author, creator and subject.
Christian, Halaschek-Wiener (uga, 2004-08) The focus of contemporary Web information retrieval systems has been to provide efficient support for the querying and retrieval of relevant documents. More recently, information retrieval over semantic metadata extracted ...
Milnor, III, William Henry (uga, 2005-12) In most contemporary approaches to pattern discovery in graphs, either quantitative anomalies or frequency of substructure is used to measure the relevance of a pattern. In this thesis, we address the issue of discovering ...
Aleman Meza, Boanerges (uga, 2007-08) In today’s web search technologies, the link structure of the web plays a critical role. In this work, the goal is to use semantic relationships for ranking documents without relying on the existence of any specific structure ...