How can we issue early warnings of an impending student dropout or an adverse health condition in near real time? How can we leverage recent interventions, such as tutoring or medication, to make those warnings more accurate? More challengingly, how do we learn to warn early from data that is peppered with such interventions? Early warnings are pivotal for avoiding long-term problems in healthcare, education, mechanical failures, cloud and disaster management. To effectively aid human decision making in these high-stakes contexts, the interpretability of the method producing the warnings is a key concern. We consider the problem of learning to produce interpretable early warnings from labeled data tainted by interventions. Our contributions are: (1) Principles: We lay out three characteristics of an ideal early warning system: dominance, precedence and intervention-awareness. (2) Algorithm: In line with these principles, we propose SmokeAlarm, which learns offline from past labeled data containing interventions and can produce early warnings online. (3) Interpretability: SmokeAlarm learns state-based progression models in the presence and absence of interventions, which are “bi-inspectable” by the human decision maker. Extensive experiments on synthetic and real-world data demonstrate that SmokeAlarm outperforms baselines (by 16-38% in terms of AUC, with an average lead time of 6.1 hours before the onset of septic shock), scales linearly with data size, and leads to intuitive, interesting discoveries in practice.