Natural language parsing is the process of analyzing an input sentence by determining its syntactic structure and representing that structure according to a given formal grammar. Parsing sentences correctly is often difficult because natural language is ambiguous and has many irregularities. If the word order is totally or partially free, the task becomes more challenging. Parsing can be based on hand-coded heuristic rules, on probabilities, or on a hybrid of both. KorPar, described in this dissertation, is a parser for Korean based on hand-coded heuristic rules, represented in a unification-based dependency grammar and implemented in Prolog. The dependency grammar provides an efficient way to parse the free word order of Korean, while the unification-based features express complex grammatical facts without complicating the parsing algorithm. As a result, the parser can be easily modified to correct the grammar, to attach probabilities to the grammar rules, and to apply it to other languages. KorPar analyzes the structure of a Korean sentence by representing it as a set of dependency pairs. Since Korean is a partially free-word-order language, KorPar accounts for the restrictions that keep its word order from being totally free, recognizes subcategorization features, restricts the order of dependents of a single head, matches long-distance dependencies, and parses nouns that lack case markers. KorPar has been tested on 100 consecutive sentences (more than 2000 words) from articles in the Chosun Ilbo newspaper. The F-score (the harmonic mean of precision and recall) was 96.3%.
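To make the evaluation measure concrete, the following is a minimal sketch of how an F-score over dependency pairs could be computed. It is not taken from the dissertation: the representation of a pair as (head position, dependent position), the function name `f_score`, and the toy data are all assumptions made for illustration, and KorPar itself is implemented in Prolog, not Python.

```python
from typing import Set, Tuple

# Assumption: a dependency pair is modelled as (head, dependent) word positions.
Pair = Tuple[int, int]


def f_score(predicted: Set[Pair], gold: Set[Pair]) -> float:
    """Return the harmonic mean of precision and recall for two sets of pairs."""
    correct = len(predicted & gold)  # pairs the parser got right
    if correct == 0:
        return 0.0  # also avoids division by zero on empty input
    precision = correct / len(predicted)  # share of proposed pairs that are right
    recall = correct / len(gold)          # share of gold pairs that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-word sentence: four gold pairs, three of them recovered.
gold = {(2, 1), (4, 3), (5, 2), (5, 4)}
predicted = {(2, 1), (4, 3), (5, 4), (5, 3)}
print(f"F-score: {f_score(predicted, gold):.3f}")
```

Treating each analysis as a set means a pair counts as correct only when both the head and the dependent match, so one misattached word lowers precision and recall together.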