Extending Active Learning for Improved Long-Term Return On Investment of Learning Systems
Decision support problems like error prediction, fraud detection, information filtering, network intrusion detection, surveillance etc are important class of business problems. They are called decision support problems because the system is expected to assist in the human decision making by recommending which transactions to review and also assisting in the detailed review process. Data mining systems for such problems deal with building classifiers that are not working by themselves but are part of a larger interactive system with an expert in the loop (figure 1.1). Characteristics of these domains include skewed class distribution, lots of unlabeled data, concept/feature drift over time, biased sampling of labeled data and expensive domain experts. Such systems are expected to be deployed and to maintain their effectiveness over a long time. Learning in such interactive settings typically involves deploying a classifier to suggest relevant positive examples to human experts for review. The cost and availability of these experts makes labeling and review of these examples expensive. The goal of a machine learning system in these settings is to not only provide immediate benefit to the users, but also to continuously improve its future performance while minimizing the expert labeling/reviewing costs and increasing the overall effectiveness of these experts.
History
Date
2013-12-13Degree Type
- Dissertation
Thesis Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)