Between the hedges
Each year, publicly traded companies are required to file a Form 10-K with the United States Securities and Exchange Commission. These documents contain an enormous amount of natural language data and may help predict financial performance. This thesis analyzes two dimensions of the language in these filings: sentiment and linguistic hedging. An experiment was conducted in which 325 human annotators manually scored a subset of the sentiment words contained in a corpus of 106 10-K filings, and an inference engine identified hedges that govern these words in a dependency tree. Finally, this work proposes an algorithm that uses the previously defined hedge cues to automatically classify sentences in the financial domain as speculative or non-speculative.
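To illustrate what it means for a hedge to govern a sentiment word in a dependency tree, the following is a minimal sketch and not the inference engine or classification algorithm of the thesis. It assumes a small hypothetical cue list (`HEDGE_CUES`), a hand-written dependency parse given as a list of head indexes (`None` for the root), and a simplified rule under which a sentence counts as speculative if any cue is an ancestor of a sentiment word. All function names are illustrative.

```python
from typing import Dict, List, Optional, Set

# Hypothetical hedge cues, chosen for illustration only (not the thesis's list).
HEDGE_CUES: Set[str] = {"may", "might", "could", "believe", "expect", "possibly"}


def governs(heads: List[Optional[int]], governor: int, dependent: int) -> bool:
    """Return True if token `governor` is an ancestor of token `dependent`.

    `heads[i]` is the index of the head of token i, or None for the root.
    The input is assumed to be a well-formed tree (no cycles).
    """
    node = heads[dependent]
    while node is not None:
        if node == governor:
            return True
        node = heads[node]
    return False


def hedged_words(
    tokens: List[str], heads: List[Optional[int]], targets: List[int]
) -> Dict[int, List[int]]:
    """Map each target (sentiment word) index to the cue indexes governing it."""
    cues = [i for i, tok in enumerate(tokens) if tok.lower() in HEDGE_CUES]
    return {w: [c for c in cues if governs(heads, c, w)] for w in targets}


def is_speculative(
    tokens: List[str], heads: List[Optional[int]], targets: List[int]
) -> bool:
    """Simplified assumption: speculative if any cue governs a target word."""
    return any(hedged_words(tokens, heads, targets).values())


# Hand-written parse of "We expect losses to increase":
# "expect" is the root, "increase" depends on "expect",
# and "losses" and "to" depend on "increase".
tokens = ["We", "expect", "losses", "to", "increase"]
heads = [1, None, 4, 4, 1]
sentiment_targets = [2]  # "losses"

print(hedged_words(tokens, heads, sentiment_targets))
print(is_speculative(tokens, heads, sentiment_targets))
```

In practice the head indexes would come from a dependency parser rather than being written by hand, and the cue list would be the one defined in the thesis.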