PIP: Pictorial Interpretable Prototype Learning for Time Series Classification
Time series classifiers are not only challenging to design, but they are also notoriously difficult to deploy for critical applications because end users may not understand or trust black-box models. Despite recent efforts, the explanations generated by existing interpretable time series models remain difficult for non-engineers to understand. To address this challenge, this paper introduces PIP, a novel deep learning architecture that jointly learns classification models and meaningful visual class prototypes. The goal of PIP is to provide time series explanations that are tailored toward specific end users. PIP allows users to train the model on their choice of class illustrations. Thus, PIP can create a user-friendly explanation by leaning on end users' definitions. We hypothesize that a pictorial description is an effective way to communicate a learned concept to non-expert users. Based on an end-user experiment with participants from multiple backgrounds, PIP offers an improved combination of accuracy and interpretability over baseline methods for time series classification.
Augmentation of Physician Assessments with Multi-Omics Enhances Predictability of Drug Response: A Case Study of Major Depressive Disorder
This work proposes a "" workflow to sequentially augment physician assessments of patients' symptoms and their socio-demographic measures with heterogeneous biological measures to accurately predict treatment outcomes using machine learning. Across many psychiatric illnesses, ranging from major depressive disorder to schizophrenia, symptom severity assessments are subjective and do not include biological measures, making eventual treatment outcomes difficult to predict. Using data from the Mayo Clinic PGRN-AMPS SSRI trial as a case study, this work demonstrates a significant improvement in prediction accuracy for antidepressant treatment outcomes in patients with major depressive disorder, from 35% to 80%, individualized by patient, compared to using only a physician's assessment as the predictor. This improvement is achieved through an iterative overlay of biological measures, starting with metabolites (blood measures modulated by drug action) associated with symptom severity, and then adding genes associated with metabolomic concentrations. Hence, therapeutic efficacy for a new patient can be assessed prior to treatment, using prediction models that take as inputs selected biological measures and physicians' assessments of depression severity. Of broader significance beyond psychiatry, the approach presented in this work can potentially be applied to predicting treatment outcomes for other medical conditions, such as migraine headaches or rheumatoid arthritis, for which patients are treated according to subject-reported assessments of symptom severity.
An Analysis Pipeline with Statistical and Visualization-Guided Knowledge Discovery for Michigan-Style Learning Classifier Systems
Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms that distribute the learned solution over a sizable population of rules. However, their application to complex real-world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes call for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data.
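As a rough illustration of what a population-wide perspective can mean in practice, the sketch below summarizes a rule population by counting how often each attribute is specified and how often pairs of attributes are specified together. It is a minimal sketch under stated assumptions, not the pipeline described in the abstract: the dictionary rule representation, the function names `specificity_counts` and `co_occurrence_counts`, and the toy population are all hypothetical, and no significance testing is included.

```python
from collections import Counter
from itertools import combinations

# Hypothetical rule representation (an assumption for illustration):
# each rule maps attribute index -> specified value; attributes that are
# missing from the dict are generalized ("don't care") in that rule.

def specificity_counts(rules):
    """Count, per attribute, how many rules in the population specify it."""
    counts = Counter()
    for rule in rules:
        for attr in rule:
            counts[attr] += 1
    return counts

def co_occurrence_counts(rules):
    """Count, per attribute pair, how many rules specify both attributes."""
    pairs = Counter()
    for rule in rules:
        for a, b in combinations(sorted(rule), 2):
            pairs[(a, b)] += 1
    return pairs

# Toy population: attributes 0 and 1 are frequently specified together,
# which is the kind of pattern one might inspect for attribute interaction.
rules = [
    {0: 1, 1: 0},
    {0: 0, 1: 1},
    {0: 1, 1: 1, 3: 0},
    {2: 1},
]

spec = specificity_counts(rules)
pairs = co_occurrence_counts(rules)
print(spec.most_common())
print(pairs.most_common())
```

Attributes with high counts across the population are candidates for being predictive, and pairs that are frequently specified together are candidates for interaction. In a noisy problem these raw counts would still need a statistical comparison against a null distribution before any conclusion is drawn.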