Automatic Identification of Character Types from Film Dialogs
We study the detection of character types from fictional dialog texts such as screenplays. Because approaches based solely on the linguistic properties of utterances are not sufficient to identify all fictional character types, we develop an integrative approach that complements linguistic analysis with interactive and communication characteristics, and show that it improves identification performance. The interactive characteristics of fictional characters are captured by the descriptive analysis of semantic graphs weighted by linguistic markers of expressivity and social role. For this approach, we introduce a new data set of action movie character types with their corresponding sequences of dialogs. The evaluation results demonstrate that the integrated approach outperforms baseline approaches on the presented data set. A comparative in-depth analysis of a single screenplay motivates a discussion of possible limitations of this approach and of directions for future research.
Robust Feature Selection Technique using Rank Aggregation
Although feature selection is a well-developed research area, there is an ongoing need to develop methods that make classifiers more efficient. One important challenge is the lack of a universal feature selection technique that produces similar outcomes with all types of classifiers. This is because every feature selection technique has its own statistical biases, while classifiers exploit different statistical properties of the data for evaluation. In many situations this leaves researchers in a dilemma as to which feature selection method and which classifier to choose from a vast range of options. In this paper, we propose a technique that aggregates the consensus properties of various feature selection methods to develop a better solution. The ensemble nature of our technique makes it more robust across various classifiers. In other words, it is stable in achieving similar, and ideally higher, classification accuracy across a wide variety of classifiers. We quantify this concept of robustness with a measure known as the Robustness Index (RI). We perform an extensive empirical evaluation of our technique on eight data sets of different dimensions: Arrhythmia, Lung Cancer, Madelon, mfeat-fourier, internet-ads, Leukemia-3c, Embryonal Tumor, and a real-world Acute Myeloid Leukemia (AML) data set. We demonstrate not only that our algorithm is more robust, but also that, compared to other techniques, it improves classification accuracy by approximately 3-4% on data sets with fewer than 500 features and by more than 5% on data sets with more than 500 features, across a wide range of classifiers.
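The idea of aggregating the rankings produced by several feature selection methods can be illustrated with a minimal sketch. The abstract does not state which aggregation rule the authors use, so the Borda count below is an assumption chosen for illustration only; the function name `borda_aggregate` and the feature names are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine several feature rankings (best feature first) into one
    consensus ranking using a Borda count.

    Assumption: Borda count stands in for whatever aggregation rule the
    paper actually uses; the abstract does not specify it.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The top-ranked feature earns n points, the last earns 1.
            scores[feature] += n - position
    # Highest total score first; ties are broken by feature name.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three different feature selection methods.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f4", "f3"],
    ["f1", "f3", "f2", "f4"],
]
consensus = borda_aggregate(rankings)
print(consensus)
```

A classifier would then be trained on the top-k features of the consensus ranking rather than on the output of any single selection method.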
Maintaining Engagement in Long-term Interventions with Relational Agents
We discuss issues in designing virtual humans for applications which require long-term voluntary use, and the problem of maintaining engagement with users over time. Concepts and theories related to engagement from a variety of disciplines are reviewed. We describe a platform for conducting studies into long-term interactions between humans and virtual agents, and present the results of two longitudinal randomized controlled experiments in which the effect of manipulations of agent behavior on user engagement was assessed.
An evaluation of machine learning techniques to predict the outcome of children treated for Hodgkin-Lymphoma on the AHOD0031 trial: A report from the Children's Oncology Group
In this manuscript we analyze a data set containing information on children with Hodgkin Lymphoma (HL) enrolled on a clinical trial. Treatments received and survival status were collected together with other covariates such as demographics and clinical measurements. Our main task is to explore the potential of machine learning (ML) algorithms in a survival analysis context in order to improve over the Cox Proportional Hazards (CoxPH) model. We discuss the weaknesses of the CoxPH model that we would like to improve upon, and then introduce multiple algorithms, from well-established ones to state-of-the-art models, that address these issues. We then compare all models according to the concordance index and the Brier score. Finally, we produce a series of recommendations, based on our experience, for practitioners who would like to benefit from recent advances in artificial intelligence.
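As a rough illustration of the first evaluation metric, the concordance index can be computed as the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed survival times. The sketch below follows the common Harrell formulation and is not taken from the manuscript; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell-style concordance index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk scores (higher means earlier expected event)

    Assumption: pairs with tied observed times are skipped for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event has higher predicted risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.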