A Context-Dependent Supervised Learning Approach to Sentiment Detection in Large Textual Databases
Authors
Albert Weichselbraun
Department of Information Systems and Operations, Vienna University of Economics and Business
Stefan Gindl
Department of New Media Technology, MODUL University Vienna
Arno Scharl
Department of New Media Technology, MODUL University Vienna
Keywords:
annotation, document enrichment, machine learning, natural language processing
Abstract
Sentiment detection automatically identifies emotions in textual data. The growing number of emotive documents available in corporate databases and on the World Wide Web calls for automated methods to process this important source of knowledge. Sentiment detection draws attention from researchers and practitioners alike, for example to enrich business intelligence applications or to measure the impact of customer reviews on purchasing decisions. Most sentiment detection approaches do not consider language ambiguity, even though the same sentiment term can differ in polarity depending on the context in which a statement is made. To address this shortcoming, this paper introduces a novel method that uses Naive Bayes to identify ambiguous terms. A contextualized sentiment lexicon stores the polarity of these terms, together with a set of co-occurring context terms. A formal evaluation of the assigned polarities confirms that considering the usage context of ambiguous terms improves the accuracy of high-throughput sentiment detection methods. Such methods are a prerequisite for using sentiment as a metadata element in storage and distributed file-level intelligence applications, as well as in enterprise portals that provide a semantic repository of an organization's information assets.
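As an illustration only, the idea of a contextualized sentiment lexicon can be sketched in a few lines of Python. The sketch below is not the authors' implementation: the toy training data, the function names `build_context_lexicon` and `polarity`, and the use of add-one smoothing are all assumptions made for this example. It stores, for one ambiguous term, how often each context term co-occurs with it in positive and negative documents, and then applies a Naive Bayes decision over the context terms of a new statement.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (tokens, polarity) pairs; not taken from the paper.
TRAIN = [
    (["battery", "life", "long"], "pos"),
    (["long", "battery", "runtime"], "pos"),
    (["long", "waiting", "time"], "neg"),
    (["long", "delay", "waiting"], "neg"),
]


def build_context_lexicon(docs, term):
    """Count context terms co-occurring with `term`, separately per polarity."""
    counts = {"pos": Counter(), "neg": Counter()}
    doc_totals = Counter()
    for tokens, label in docs:
        if term in tokens:
            doc_totals[label] += 1
            counts[label].update(t for t in tokens if t != term)
    return counts, doc_totals


def polarity(entry, context):
    """Naive Bayes decision over context terms (add-one smoothing, an assumption)."""
    counts, doc_totals = entry
    vocab = set(counts["pos"]) | set(counts["neg"])
    total_docs = sum(doc_totals.values())
    scores = {}
    for label in ("pos", "neg"):
        # Smoothed log prior of the polarity class for this ambiguous term.
        score = math.log((doc_totals[label] + 1) / (total_docs + 2))
        n = sum(counts[label].values())
        for t in context:
            if t in vocab:  # context terms never seen in training are ignored
                score += math.log((counts[label][t] + 1) / (n + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


entry = build_context_lexicon(TRAIN, "long")
print(polarity(entry, ["battery"]))
print(polarity(entry, ["waiting"]))
```

In this toy setting the ambiguous term "long" is resolved differently depending on whether it appears near "battery" or near "waiting", which is the kind of context dependence the abstract describes.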