Project Achoo: A Practical Model and Application for COVID-19 Detection From Recordings of Breath, Voice, and Cough
The COVID-19 pandemic created significant interest and demand for infection detection and monitoring solutions. In this paper, we propose a machine learning method to quickly detect COVID-19 using audio recordings made on consumer devices. The approach combines signal processing and noise removal methods with an ensemble of fine-tuned deep learning networks and enables COVID-19 detection from cough recordings. We have also developed and deployed a mobile application that uses a symptoms checker together with voice, breath, and cough signals to detect COVID-19 infection. The application showed robust performance on both publicly available datasets and the noisy data collected during beta testing by the end users.
Detection of SARS-CoV-2 in COVID-19 Patient Nasal Swab Samples Using Signal Processing
This work presents an opto-electrical method that measures the viral nucleocapsid protein and anti-N antibody interactions to differentiate between SARS-CoV-2 negative and positive nasal swab samples. Upon light exposure of the patient nasal swab sample mixed with the anti-N antibody, charge transfer (CT) transitions within the altered protein folds are initiated between the charged amino acid side-chain moieties and the peptide backbone, which play the role of donor and acceptor groups. A Figure of Merit (FOM) was introduced to correlate the relative variations of the samples with and without antibody at two different voltages. Empirically, SARS-CoV-2 in patient nasal swab samples was detected within two minutes if an extracted FOM threshold of >1 was achieved; otherwise, the sample was considered negative.
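The abstract specifies only the decision rule (FOM > 1 implies positive); the exact FOM formula is not given. The sketch below is therefore a hypothetical illustration: `relative_variation` and the ratio-of-two-voltages form of `figure_of_merit` are assumptions, and only the thresholding step reflects the stated method.

```python
# Hypothetical sketch: only the "FOM > 1 => positive" rule comes from the text.
# The FOM definition below (relative signal change with vs. without antibody,
# compared at two bias voltages) is an assumed form for illustration.

def relative_variation(signal_with_ab, signal_without_ab):
    """Relative change introduced by the anti-N antibody (assumed definition)."""
    return abs(signal_with_ab - signal_without_ab) / abs(signal_without_ab)

def figure_of_merit(v1_pair, v2_pair):
    """Correlate relative variations at two voltages as a simple ratio (assumed form)."""
    return relative_variation(*v1_pair) / relative_variation(*v2_pair)

def classify(fom, threshold=1.0):
    return "positive" if fom > threshold else "negative"

# Toy readings (arbitrary units): (with antibody, without antibody) at each voltage.
fom = figure_of_merit((1.8, 1.0), (1.2, 1.0))   # 0.8 / 0.2 = 4.0
result = classify(fom)
```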
Modeling Social Distancing and Quantifying Epidemic Disease Exposure in a Built Environment
As we transition away from pandemic-induced isolation and social distancing, there is a need to estimate the risk of exposure in built environments. We propose a novel metric to quantify social distancing and the potential risk of exposure to airborne diseases in an indoor setting, which scales with distance and the number of people present. The risk of exposure metric is designed to incorporate the dynamics of particle movement in an enclosed set of rooms for people at different immunity levels, susceptibility due to age, background infection rates, intrinsic individual risk factors (e.g., comorbidities), mask-wearing levels, the half-life of the virus and ventilation rate in the environment. The model parameters have been selected for COVID-19, although the modeling framework applies to other airborne diseases. The performance of the metric is tested using simulations of a real physical environment, combining models for walking, path length dynamics, and air-conditioning replacement action. We have also created a visualization tool to help identify high-risk areas in the built environment. The resulting software framework is being used to help with planning movement and scheduling in a clinical environment ahead of reopening of the facility, for deciding the maximum time within an environment that is safe for a given number of people, for air replacement settings on air-conditioning and heating systems, and for mask-wearing policies. The framework can also be used for identifying locations where foot traffic might create high-risk zones and for planning timetabled transitions of groups of people between activities in different spaces. 
Moreover, when coupled with individual-level location tracking (via radio-frequency tagging, for example), the exposure risk metric can be used in real-time to estimate the risk of exposure to the coronavirus or other airborne illnesses, and intervene through air-conditioning action modification, changes in timetabling of group activities, mask-wearing policies, or restricting the number of individuals entering a given room/space. All software is provided online under an open-source license.
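The abstract lists the factors the risk metric incorporates but not its functional form. The following toy sketch is an assumption-laden stand-in: an inverse-square distance kernel, multiplicative mask and susceptibility factors, and an exponential loss term combining ventilation rate and viral half-life, summed over occupants so the metric scales with crowding and proximity as described.

```python
import math

# Toy exposure-risk sketch; the real model's functional form is not given in
# the abstract, so every term below is an illustrative assumption.

def pairwise_risk(distance_m, mask_factor, susceptibility, minutes,
                  air_changes_per_hour=6.0, half_life_min=60.0):
    # Per-minute particle loss from ventilation plus viral decay.
    decay = air_changes_per_hour / 60.0 + math.log(2) / half_life_min
    # Accumulated dose over the stay, attenuated by distance.
    dose = (1.0 / max(distance_m, 0.5) ** 2) * (1 - math.exp(-decay * minutes)) / decay
    return dose * mask_factor * susceptibility

def room_risk(positions, index_case, **kw):
    """Total risk from one infectious occupant; grows with crowding and proximity."""
    return sum(pairwise_risk(math.dist(p, index_case), **kw) for p in positions)

occupants = [(1.0, 0.0), (2.0, 0.0), (4.0, 3.0)]
r = room_risk(occupants, (0.0, 0.0), mask_factor=0.3, susceptibility=1.0, minutes=30)
```

Such a metric behaves in the qualitatively expected directions: closer occupants, longer stays, and lower mask-wearing levels all increase the score.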
Application of Tensor Decomposition to Gene Expression of Infection of Mouse Hepatitis Virus Can Identify Critical Human Genes and Effective Drugs for SARS-CoV-2 Infection
To better understand the genes with altered expression caused by infection with the novel coronavirus strain SARS-CoV-2 causing COVID-19 infectious disease, a tensor decomposition (TD)-based unsupervised feature extraction (FE) approach was applied to a gene expression profile dataset of the mouse liver and spleen with experimental infection of mouse hepatitis virus, which is regarded as a suitable model of human coronavirus infection. TD-based unsupervised FE selected 134 altered genes, which were enriched in protein-protein interactions with orf1ab, polyprotein, and 3C-like protease that are well known to play critical roles in coronavirus infection, suggesting that these 134 genes can represent the coronavirus infectious process. We then selected compounds targeting the expression of the 134 selected genes based on a public domain database. The identified drug compounds were mainly related to known antiviral drugs, several of which were also included in those previously screened with a method to identify candidate drugs for treating COVID-19.
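A minimal sketch of the general idea of tensor-decomposition-based unsupervised feature extraction: decompose a genes x samples x tissues expression tensor, take the leading gene-mode loading vector, and select genes whose loadings are outliers. The unfolding-plus-SVD step and the z-score threshold below are simplifying assumptions (the published method uses higher-order decompositions and a p-value criterion); the synthetic data are not real expression values.

```python
import numpy as np

# Toy genes x samples x tissues tensor with ten "infection-altered" genes.
rng = np.random.default_rng(0)
n_genes, n_samples, n_tissues = 200, 6, 2
X = rng.normal(size=(n_genes, n_samples, n_tissues))
X[:10] += 3.0                                # shifted expression in genes 0-9

unfolded = X.reshape(n_genes, -1)            # mode-1 (gene-mode) unfolding
U, s, Vt = np.linalg.svd(unfolded, full_matrices=False)
loading = U[:, 0]                            # leading gene-mode singular vector

# Select genes whose loadings are outliers (z-score stands in for the
# chi-squared p-value criterion of the actual method).
z = (loading - loading.mean()) / loading.std()
selected = np.where(np.abs(z) > 2.0)[0]
```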
Adaptive constrained independent vector analysis: An effective solution for analysis of large-scale medical imaging data
There is a growing need for flexible methods for the analysis of large-scale functional magnetic resonance imaging (fMRI) data for the estimation of global signatures that summarize the population while preserving individual-specific traits. Independent vector analysis (IVA) is a data-driven method that jointly estimates global spatio-temporal patterns from multi-subject fMRI data, and effectively preserves subject variability. However, as we show, IVA performance is negatively affected as the number of datasets and components increases, especially when component correlation across the datasets is low. We study the problem and its relationship with correlation across the datasets, and propose an effective method for addressing the issue by incorporating reference information about the estimation patterns into the formulation, as guidance in high-dimensional scenarios. Constrained IVA (cIVA) provides an efficient framework for incorporating references; however, its performance depends on a user-defined constraint parameter, which enforces the association between the reference signals and estimation patterns at a fixed level. We propose adaptive cIVA (acIVA), which tunes the constraint parameter to allow flexible associations between the references and estimation patterns and enables the incorporation of multiple reference signals, without enforcing inaccurate conditions. Our results indicate that acIVA can reliably estimate high-dimensional multivariate sources from large-scale simulated datasets, when compared with standard IVA. It also successfully extracts meaningful functional networks from a large-scale fMRI dataset for which standard IVA did not converge. The method also efficiently captures subject-specific information, demonstrated through observed gender differences in spectral power (higher in males at low frequencies and in females at high frequencies) within the motor, attention, visual, and default mode networks.
A Domain Enriched Deep Learning Approach to Classify Atherosclerosis using Intravascular Ultrasound Imaging
Intravascular ultrasound (IVUS) imaging is widely used for diagnostic imaging in interventional cardiology. The detection and quantification of atherosclerosis from acquired images is typically performed manually by medical experts or by virtual histology IVUS (VH-IVUS) software. VH-IVUS analyzes backscattered radio frequency (RF) signals to provide a color-coded tissue map, and is the method of choice for assessing atherosclerotic plaque. However, a significant amount of tissue cannot be analyzed in reasonable time because the method can be applied just once per cardiac cycle. Furthermore, only hardware and software compatible with RF signal acquisition and processing may be used. We present an image-based tissue characterization method that can be applied to entire acquisition sequences for the assessment of diseased vessels. The pixel-based method utilizes domain knowledge of arterial pathology and physiology, and leverages technological advances of convolutional neural networks to segment diseased vessel walls into the same tissue classes as virtual histology using only grayscale IVUS images. The method was trained and tested on patches extracted from VH-IVUS images acquired from several patients, and achieved overall accuracy of 93.5% for all segmented tissue. Imposing physically-relevant spatial constraints driven by domain knowledge was key to achieving such strong performance. This enriched approach offers capabilities akin to VH-IVUS without the constraints of RF signals or limited once-per-cycle analysis, offering superior potential information acquisition speed, reduced hardware and software requirements, and more widespread applicability. Such an approach holds promise for future clinical and research applications.
J-MoDL: Joint Model-Based Deep Learning for Optimized Sampling and Reconstruction
Modern MRI schemes, which rely on compressed sensing or deep learning algorithms to recover MRI data from undersampled multichannel Fourier measurements, are widely used to reduce the scan time. The image quality of these approaches is heavily dependent on the sampling pattern. We introduce a continuous strategy to optimize the sampling pattern and the network parameters jointly. We use a multichannel forward model, consisting of a non-uniform Fourier transform with continuously defined sampling locations, to realize the data consistency block within a model-based deep learning image reconstruction scheme. This approach facilitates the joint and continuous optimization of the sampling pattern and the CNN parameters to improve image quality. We observe that the joint optimization of the sampling patterns and the reconstruction module significantly improves the performance of most deep learning reconstruction algorithms. The source code is available at https://github.com/hkaggarwal/J-MoDL.
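The key ingredient described above is a forward model whose sampling locations are real-valued, so they can be optimized alongside the network weights. A minimal single-channel, 1-D sketch of that idea follows: a non-uniform DFT evaluated at continuous frequencies, plus the kind of gradient-based data-consistency step used inside an unrolled reconstruction. The actual J-MoDL framework is multichannel and 2-D; everything below is a simplified illustration.

```python
import numpy as np

def nudft(x, k):
    """Fourier measurements of x at continuous (non-integer) frequencies k."""
    n = np.arange(len(x))
    A = np.exp(-2j * np.pi * np.outer(k, n) / len(x))   # |k| x N encoding matrix
    return A @ x, A

rng = np.random.default_rng(1)
x = rng.normal(size=32)                        # toy image (1-D)
k = np.sort(rng.uniform(0, 32, size=16))       # 16 continuously valued sample locations
y, A = nudft(x, k)                             # undersampled measurements

# One data-consistency gradient step from an initial guess, as used inside
# a model-based unrolled reconstruction scheme.
z = np.zeros(32, dtype=complex)
z = z - 0.01 * A.conj().T @ (A @ z - y)
```

Because `k` enters `A` through a smooth exponential, the measurement model is differentiable in the sampling locations themselves, which is what makes joint optimization of the pattern and the CNN parameters possible.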
Dense Recurrent Neural Networks for Accelerated MRI: History-Cognizant Unrolling of Optimization Algorithms
Inverse problems for accelerated MRI typically incorporate domain-specific knowledge about the forward encoding operator in a regularized reconstruction framework. Recently, physics-driven deep learning (DL) methods have been proposed to use neural networks for data-driven regularization. These methods unroll iterative optimization algorithms to solve the inverse problem objective function, by alternating between domain-specific data consistency and data-driven regularization via neural networks. The whole unrolled network is then trained end-to-end to learn the parameters of the network. Due to the simplicity of data-consistency updates with gradient descent steps, proximal gradient descent (PGD) is a common approach to unroll physics-driven DL reconstruction methods. However, PGD methods have slow convergence rates, necessitating a higher number of unrolled iterations, leading to memory issues in training and slower reconstruction times in testing. Inspired by efficient variants of PGD methods that use a history of the previous iterates, we propose a history-cognizant unrolling of the optimization algorithm with dense connections across iterations for improved performance. In our approach, the gradient descent steps are calculated at a trainable combination of the outputs of all the previous regularization units. We also apply this idea to unrolling variable splitting methods with quadratic relaxation. Our results in reconstruction of the fastMRI knee dataset show that the proposed history-cognizant approach reduces residual aliasing artifacts compared to its conventional unrolled counterpart without requiring extra computational power or increasing reconstruction time.
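The dense-connection idea above can be sketched on a toy sparse least-squares problem: each gradient step starts not from the previous iterate but from a combination of all previous "regularization unit" outputs. In the paper that combination is trainable and the regularizer is a neural network; here the weights are fixed to uniform and soft-thresholding stands in for the learned regularizer, so this is an illustrative simplification only.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, a stand-in for the learned regularization network."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(2)
A = rng.normal(size=(40, 80)) / np.sqrt(40)    # toy underdetermined forward model
x_true = np.zeros(80); x_true[:5] = 1.0
y = A @ x_true

L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
history = [np.zeros(80)]                       # outputs of all regularization units
for it in range(30):
    w = np.ones(len(history)) / len(history)   # trainable combination (fixed here)
    z = sum(wi * h for wi, h in zip(w, history))   # dense connection across iterations
    z = z - (1.0 / L) * A.T @ (A @ z - y)          # data-consistency gradient step
    history.append(soft(z, 0.01 / L))              # regularization unit output

x_hat = history[-1]
```

Plain unrolled PGD corresponds to replacing the weighted sum with `history[-1]` alone; the dense variant gives the network freedom to reuse earlier iterates, which is the history-cognizant element.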
Improved subglottal pressure estimation from neck-surface vibration in healthy speakers producing non-modal phonation
Subglottal air pressure plays a major role in voice production and is a primary factor in controlling voice onset, offset, sound pressure level, glottal airflow, vocal fold collision pressures, and variations in fundamental frequency. Previous work has shown promise for the estimation of subglottal pressure from an unobtrusive miniature accelerometer sensor attached to the anterior base of the neck during typical modal voice production across multiple pitch and vowel contexts. This study expands on that work to incorporate additional accelerometer-based measures of vocal function to compensate for non-modal phonation characteristics and achieve an improved estimation of subglottal pressure. Subjects with normal voices repeated /p/-vowel syllable strings from loud-to-soft levels in multiple vowel contexts (/ɑ/, /i/, and /u/), pitch conditions (comfortable, lower than comfortable, higher than comfortable), and voice quality types (modal, breathy, strained, and rough). Subject-specific, stepwise regression models were constructed using root-mean-square (RMS) values of the accelerometer signal alone (baseline condition) and in combination with cepstral peak prominence, fundamental frequency, and glottal airflow measures derived using subglottal impedance-based inverse filtering. Five-fold cross-validation assessed the robustness of model performance using the root-mean-square error metric for each regression model. Each cross-validation fold exhibited up to a 25% decrease in prediction error when the model incorporated multidimensional aspects of the accelerometer signal compared with RMS-only models. Improved estimation of subglottal pressure for non-modal phonation was thus achievable, paving the way for future studies of subglottal pressure estimation in patients with voice disorders and in ambulatory voice recordings.
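The modeling comparison above (RMS-only baseline versus RMS plus additional vocal-function regressors, evaluated by cross-validated RMSE) can be sketched with synthetic data. The feature names mirror the study's regressors, but the data below are simulated stand-ins, not voice measurements, and ordinary least squares replaces the stepwise selection procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
rms = rng.uniform(0.1, 1.0, n)               # accelerometer RMS (baseline feature)
cpp = rng.normal(size=n)                     # stand-in for cepstral peak prominence
f0 = rng.normal(size=n)                      # stand-in for fundamental frequency
# Synthetic "subglottal pressure" that truly depends on all three features.
pressure = 5.0 * rms + 1.5 * cpp - 0.8 * f0 + rng.normal(scale=0.2, size=n)

def cv_rmse(X, y, folds=5):
    """Five-fold cross-validated RMSE of an ordinary least-squares model."""
    idx = np.arange(len(y)); err = []
    for f in range(folds):
        test = idx[f::folds]; train = np.setdiff1d(idx, test)
        Xtr = np.column_stack([np.ones(len(train)), X[train]])
        Xte = np.column_stack([np.ones(len(test)), X[test]])
        beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        err.append(np.sqrt(np.mean((Xte @ beta - y[test]) ** 2)))
    return float(np.mean(err))

rmse_baseline = cv_rmse(rms[:, None], pressure)
rmse_full = cv_rmse(np.column_stack([rms, cpp, f0]), pressure)
```

When the target genuinely depends on the extra regressors, the multidimensional model's cross-validated error drops below the RMS-only baseline, mirroring the reported improvement.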
Automatic Assessment of Speech Impairment in Cantonese-speaking People with Aphasia
Aphasia is a common type of acquired language impairment resulting from dysfunction in specific brain regions. Analysis of narrative spontaneous speech, e.g., story-telling, is an essential component of standardized clinical assessment of people with aphasia (PWA). Subjective assessment by trained speech-language pathologists (SLPs) has many limitations in efficiency, effectiveness, and practicality. This paper describes a fully automated system for speech assessment of Cantonese-speaking PWA. A deep neural network (DNN) based automatic speech recognition (ASR) system is developed for aphasic speech by multi-task training with both in-domain and out-of-domain speech data. Story-level embedding and a Siamese network are applied to derive robust text features, which can be used to quantify the difference between aphasic speech and unimpaired speech. The proposed text features are combined with conventional acoustic features to cover different aspects of speech and language impairment in PWA. Experimental results show a high correlation between predicted scores and subjective assessment scores. The best correlation value achieved with ASR-generated transcription is .827, as compared with .844 achieved with manual transcription. The Siamese network significantly outperforms story-level embedding in generating text features for automatic assessment.
A Review of Automated Speech and Language Features for Assessment of Cognitive and Thought Disorders
It is widely accepted that information derived from analyzing speech (the acoustic signal) and language production (words and sentences) serves as a useful window into the health of an individual's cognitive ability. In fact, most neuropsychological testing batteries have a component related to speech and language where clinicians elicit speech from patients for subjective evaluation across a broad set of dimensions. With advances in speech signal processing and natural language processing, there has been recent interest in developing tools to detect more subtle changes in cognitive-linguistic function. This work relies on extracting a set of features from recorded and transcribed speech for objective assessments of speech and language, early diagnosis of neurological disease, and tracking of disease after diagnosis. With an emphasis on cognitive and thought disorders, in this paper we provide a review of existing speech and language features used in this domain, discuss their clinical application, and highlight their advantages and disadvantages. Broadly speaking, the review is split into two categories: language features based on natural language processing and speech features based on speech signal processing. Within each category, we consider features that aim to measure complementary dimensions of cognitive-linguistics, including language diversity, syntactic complexity, semantic coherence, and timing. We conclude the review with a proposal of new research directions to further advance the field.
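Two of the feature families reviewed above (language diversity and timing) are simple enough to sketch directly. The implementations below are deliberately minimal illustrations: type-token ratio for lexical diversity, and mean pause length from word-aligned timestamps; the literature covered by the review uses more robust variants (e.g., moving-average TTR).

```python
def type_token_ratio(words):
    """Lexical diversity: unique word types divided by total tokens."""
    return len(set(w.lower() for w in words)) / len(words)

def mean_pause(word_intervals):
    """Timing feature: mean silence between consecutive words.

    word_intervals: list of (start_s, end_s) per word, in order.
    """
    pauses = [s2 - e1 for (_, e1), (s2, _) in zip(word_intervals, word_intervals[1:])]
    return sum(pauses) / len(pauses) if pauses else 0.0

transcript = "the cat sat on the mat and the cat slept".split()
ttr = type_token_ratio(transcript)                        # 7 types / 10 tokens = 0.7
gap = mean_pause([(0.0, 0.3), (0.5, 0.8), (1.6, 1.9)])    # pauses 0.2 and 0.8 -> 0.5
```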
Distributed Differentially-Private Algorithms for Matrix and Tensor Factorization
In many signal processing and machine learning applications, datasets containing private information are held at different locations, requiring the development of distributed privacy-preserving algorithms. Tensor and matrix factorizations are key components of many processing pipelines. In the distributed setting, differentially private algorithms suffer because they introduce noise to guarantee privacy. This paper designs new and improved distributed and differentially private algorithms for two popular matrix and tensor factorization methods: principal component analysis (PCA) and orthogonal tensor decomposition (OTD). The new algorithms employ a correlated noise design scheme to alleviate the effects of noise and can achieve the same noise level as the centralized scenario. Experiments on synthetic and real data illustrate the regimes in which the correlated noise allows performance matching with the centralized setting, outperforming previous methods and demonstrating that meaningful utility is possible while guaranteeing differential privacy.
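The aggregation benefit of a correlated noise design can be sketched numerically: sites jointly generate zero-sum noise terms on top of smaller independent noise, so each site's released statistic carries the full local noise, but the zero-sum parts cancel when the aggregator averages across sites. This toy illustrates only that cancellation mechanism, not the paper's algorithms or a full differential-privacy accounting.

```python
import numpy as np

rng = np.random.default_rng(4)
S, d = 10, 5                                   # number of sites, statistic dimension
local_stats = rng.normal(size=(S, d))          # each site's private statistic

# Jointly generated noise, constrained to sum to zero across sites.
correlated = rng.normal(scale=1.0, size=(S, d))
correlated -= correlated.mean(axis=0)
# Smaller independent noise, matching the centralized noise level in aggregate.
independent = rng.normal(scale=0.1, size=(S, d))

released = local_stats + correlated + independent   # what each site publishes
aggregate = released.mean(axis=0)                   # correlated parts cancel here

leftover = aggregate - local_stats.mean(axis=0)     # only the small noise survives
```

Each individual release is perturbed at the large (scale 1.0) level, while the aggregate is perturbed only at the small (scale 0.1) level, which is the regime in which utility can match the centralized setting.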
A Modulo-Based Architecture for Analog-to-Digital Conversion
Systems that capture and process analog signals must first acquire them through an analog-to-digital converter. While subsequent digital processing can remove statistical correlations present in the acquired data, the dynamic range of the converter is typically scaled to match that of the input analog signal. The present paper develops an approach for analog-to-digital conversion that aims at minimizing the number of bits per sample at the output of the converter. This is attained by reducing the dynamic range of the analog signal by performing a modulo operation on its amplitude, and then quantizing the result. While the converter itself is universal and agnostic of the statistics of the signal, the decoder operation on the output of the quantizer can exploit the statistical structure in order to unwrap the modulo folding. The performance of this method is shown to approach information theoretical limits, as captured by the rate-distortion function, in various settings. An architecture for modulo analog-to-digital conversion via ring oscillators is suggested, and its merits are numerically demonstrated.
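The fold-then-unwrap pipeline above admits a compact numerical sketch: fold a large-amplitude signal into a small modulo range, quantize, and let the decoder recover the signal by re-centering the wrapped first differences (valid whenever the signal changes by less than half the modulo range between samples, a simple stand-in for the statistical structure the decoder exploits). The ring-oscillator hardware realization is not modeled here.

```python
import numpy as np

D = 1.0                                         # modulo range of the converter
t = np.linspace(0, 1, 400)
x = 3.0 * np.sin(2 * np.pi * 2 * t)             # dynamic range far exceeds D

folded = np.mod(x, D)                           # what the modulo ADC sees
q = np.round(folded * 256) / 256                # 8-bit quantization of the folds

# Decoder: wrapped differences, re-centered into [-D/2, D/2), then summed.
d = np.diff(q)
d = np.mod(d + D / 2, D) - D / 2
x_hat = np.concatenate([[q[0]], q[0] + np.cumsum(d)])
```

The reconstruction matches the original up to quantization error (and, in general, a global multiple of D), even though the converter only ever represented values in [0, 1) with 8 bits.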
Graph Frequency Analysis of Brain Signals
This paper presents methods to analyze functional brain networks and signals from graph spectral perspectives. The notion of frequency and filters traditionally defined for signals supported on regular domains such as discrete time and image grids has been recently generalized to irregular graph domains, and defines brain graph frequencies associated with different levels of spatial smoothness across the brain regions. Brain network frequency also enables the decomposition of brain signals into pieces corresponding to smooth or rapid variations. We relate graph frequency with principal component analysis when the networks of interest denote functional connectivity. The methods are utilized to analyze brain networks and signals as subjects master a simple motor skill. We observe that brain signals corresponding to different graph frequencies exhibit different levels of adaptability throughout learning. Further, we notice a strong association between graph spectral properties of brain networks and the level of exposure to tasks performed, and recognize the most contributing and important frequency signatures at different levels of task familiarity.
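The graph-frequency machinery described above can be sketched on a toy graph: eigenvectors of the graph Laplacian define a graph Fourier basis, small eigenvalues correspond to signals varying smoothly across connected nodes, and any signal splits exactly into a smooth part plus a rapidly varying part. A four-node path graph stands in for a functional brain network here.

```python
import numpy as np

W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)      # adjacency (toy "connectivity")
L = np.diag(W.sum(axis=1)) - W                 # combinatorial graph Laplacian
lam, V = np.linalg.eigh(L)                     # graph frequencies and Fourier basis

signal = np.array([1.0, 1.2, 0.9, 1.1])        # one signal value per brain region
coeffs = V.T @ signal                          # graph Fourier transform

low = V[:, :2] @ coeffs[:2]                    # smooth (low graph frequency) part
high = V[:, 2:] @ coeffs[2:]                   # rapidly varying part
```

Analyzing `low` and `high` separately over time is the kind of decomposition used in the paper to track which spatial-smoothness components of brain signals adapt during learning.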
Blind Source Separation for Unimodal and Multimodal Brain Networks: A Unifying Framework for Subspace Modeling
In the past decade, numerous advances in the study of the human brain were fostered by successful applications of blind source separation (BSS) methods to a wide range of imaging modalities. The main focus has been on extracting "networks" represented as the underlying latent sources. While the broad success in learning latent representations from multiple datasets has promoted the wide presence of BSS in modern neuroscience, it also introduced a wide variety of objective functions, underlying graphical structures, and parameter constraints for each method. Such diversity, combined with a host of datatype-specific know-how, can cause a sense of disorder and confusion, hampering a practitioner's judgment and impeding further development. We organize the diverse landscape of BSS models by exposing its key features and combining them to establish a novel unifying view of the area. In the process, we unveil important connections among models according to their properties and subspace structures. Consequently, a high-level descriptive structure is exposed, ultimately helping practitioners select the right model for their applications. Equipped with that knowledge, we review the current state of BSS applications to neuroimaging. The gained insight into model connections elicits a broader sense of generalization, highlighting several directions for model development. In light of that, we discuss emerging multi-dataset multidimensional (MDM) models and summarize their benefits for the study of the healthy brain and disease-related changes.
Transmodal Learning of Functional Networks for Alzheimer's Disease Prediction
Functional connectivity describes neural activity from resting-state functional magnetic resonance imaging (rs-fMRI). This noninvasive modality is a promising imaging biomarker of neurodegenerative diseases, such as Alzheimer's disease (AD), where the connectome can be an indicator to assess and to understand the pathology. However, it only provides noisy measurements of brain activity. As a consequence, it has shown fairly limited discrimination power on clinical groups. So far, the reference functional marker of AD is fluorodeoxyglucose positron emission tomography (FDG-PET). It gives a reliable quantification of metabolic activity, but it is costly and invasive. Here, our goal is to analyze AD populations solely based on rs-fMRI, as functional connectivity is correlated to metabolism. We introduce transmodal learning: leveraging a prior from one modality to improve results of another modality on different subjects. A metabolic prior is learned from an independent FDG-PET dataset to improve functional connectivity-based prediction of AD. The prior acts as a regularization of connectivity learning and improves the estimation of discriminative patterns from distinct rs-fMRI datasets. Our approach is a two-stage classification strategy that combines several seed-based connectivity maps to cover a large number of functional networks that identify AD physiopathology. Experimental results show that our transmodal approach increases classification accuracy compared to pure rs-fMRI approaches, without resorting to additional invasive acquisitions. The method successfully recovers brain regions known to be impacted by the disease.
Localizing Sources of Brain Disease Progression with Network Diffusion Model
Pinpointing the sources of dementia is crucial to the effective treatment of neurodegenerative diseases. In this paper, we propose a diffusion model with impulsive sources over the brain connectivity network to model the progression of brain atrophy. To reliably estimate the atrophy sources, we impose sparse regularization on the source distribution and solve the inverse problem with an efficient gradient descent method. We localize the possible origins of Alzheimer's disease (AD) based on a large set of repeated magnetic resonance imaging (MRI) scans in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The distribution of the sources averaged over the sample population is evaluated. We find that the dementia sources have different concentrations in the brain lobes for AD patients and mild cognitive impairment (MCI) subjects, indicating a possible switch of the dementia-driving mechanism. Moreover, we demonstrate that we can effectively predict changes in brain atrophy patterns with the proposed model. Our work could help understand the dynamics and origin of dementia, as well as monitor the progression of the disease at an early stage.
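The forward model behind this approach is network diffusion: atrophy spreading from a sparse source s over the connectivity graph as x(t) = exp(-beta L t) s, with L the graph Laplacian. The sketch below simulates only this forward model on a toy five-region "connectome" (the paper's contribution, the sparse-regularized inverse problem, is not implemented here).

```python
import numpy as np

W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)   # toy 5-region connectivity network
L = np.diag(W.sum(axis=1)) - W                 # graph Laplacian

def diffuse(L, source, beta, t):
    """x(t) = exp(-beta * L * t) @ source, via the eigendecomposition of L."""
    lam, V = np.linalg.eigh(L)
    return V @ (np.exp(-beta * lam * t) * (V.T @ source))

s = np.array([1.0, 0, 0, 0, 0])                # impulsive source at region 0
x = diffuse(L, s, beta=1.0, t=0.5)             # atrophy pattern after diffusion
```

Two properties of the heat kernel make the model well behaved: total atrophy mass is conserved (the all-ones vector is a zero-eigenvalue eigenvector of L), and mass remains concentrated near the source at early times, which is what makes source localization from observed patterns feasible.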
Leveraging Multi-Modal Sensing for Mobile Health: A Case Review in Chronic Pain
Active and passive mobile sensing has garnered much attention in recent years. In this paper, we focus on chronic pain measurement and management as a case application to exemplify the state of the art. We present a consolidated discussion on leveraging various sensing modalities along with the modular server-side and on-device architectures required for this task. The modalities included are activity monitoring from accelerometry and location sensing, audio analysis of speech, and image processing for facial expressions, as well as modern methods for effective patient self-reporting. We review examples that deliver actionable information to clinicians and patients while addressing privacy, usability, and computational constraints. We also discuss open challenges in the higher level inferencing of patient state and effective feedback with potential directions to address them. The methods and challenges presented here are also generalizable and relevant to a broad range of other applications in mobile sensing.
One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes
Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions. The first step towards automated interventions is to detect when an individual faces difficulty with activities. We propose machine learning approaches based on one-class classification that learn normal activity patterns. When we apply these classifiers to activity patterns that were not seen before, the classifiers are able to detect activity errors, which represent potential prompt situations. We validate our approaches on smart home sensor data obtained from older adult participants, some of whom faced difficulties performing routine activities and thus committed errors.
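The one-class idea above (learn only normal activity patterns, then flag departures as potential prompt situations) can be sketched with a simple density model: fit the mean and covariance of normal activity-feature vectors and score new days by Mahalanobis distance. The features and the threshold rule below are illustrative assumptions; the study's sensor features and classifiers are richer.

```python
import numpy as np

rng = np.random.default_rng(5)
# Normal days only: e.g. (steps, room transitions, activity duration in minutes).
normal = rng.normal(loc=[5.0, 2.0, 30.0], scale=[0.5, 0.3, 3.0], size=(300, 3))
mu = normal.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(normal, rowvar=False))

def score(x):
    """Mahalanobis distance of a day's features from the normal-activity model."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Threshold set from the normal data alone (one-class: no error examples needed).
threshold = np.quantile([score(x) for x in normal], 0.99)

typical_day = np.array([5.1, 2.1, 29.0])
error_day = np.array([5.0, 2.0, 60.0])         # activity took twice as long
```

An `error_day` scoring above the threshold represents a detected activity error, i.e. a candidate moment for a reminder-based intervention.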
Discovering Multidimensional Motifs in Physiological Signals for Personalized Healthcare
Personalized diagnosis and therapy require monitoring patient activity using various body sensors. Sensor data generated during personalized exercises or tasks may be too specific or inadequate to be evaluated using supervised methods such as classification. We propose multidimensional motif (MDM) discovery as a means for patient activity monitoring, since such motifs can capture repeating patterns across multiple dimensions of the data, and can serve as conformance indicators. Previous studies pertaining to mining MDMs have proposed approaches that lack the capability of concurrently processing multiple dimensions, thus limiting their utility in online scenarios. In this paper, we propose an efficient real-time approach to MDM discovery in body sensor generated time series data for monitoring performance of patients during therapy. We present two alternative models for MDMs based on motif co-occurrences and temporal ordering among motifs across multiple dimensions, with detailed formulation of the concepts proposed. The proposed method uses an efficient hashing-based record to enable speedy update and retrieval of motif sets, and identification of MDMs. Performance evaluation using synthetic and real body sensor data in unsupervised motif discovery tasks shows that the approach is effective for (a) concurrent processing of multidimensional time series information suitable for real-time applications and (b) finding unknown naturally occurring patterns with minimal delay.
Beyond Low Rank + Sparse: Multi-scale Low Rank Matrix Decomposition
We present a natural generalization of the recent low rank + sparse matrix decomposition and consider the decomposition of matrices into components of multiple scales. Such decomposition is well motivated in practice as data matrices often exhibit local correlations in multiple scales. Concretely, we propose a multi-scale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes. We then consider the inverse problem of decomposing the data matrix into its multi-scale low rank components and approach the problem via a convex formulation. Theoretically, we show that under various incoherence conditions, the convex program recovers the multi-scale low rank components either exactly or approximately. Practically, we provide guidance on selecting the regularization parameters and incorporate cycle spinning to reduce blocking artifacts. Experimentally, we show that the multi-scale low rank decomposition provides a more intuitive decomposition than conventional low rank methods and demonstrate its effectiveness in four applications, including illumination normalization for face images, motion separation for surveillance videos, multi-scale modeling of dynamic contrast-enhanced magnetic resonance imaging, and collaborative filtering exploiting age information.
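The multi-scale low-rank model above can be sketched by construction: a matrix formed as a sum of block-wise rank-1 components at a fine block scale plus a globally rank-1 component. The paper recovers such components via a convex program; the sketch below only builds the model and verifies the block-level ranks, as a concrete instance of "local correlations in multiple scales".

```python
import numpy as np

rng = np.random.default_rng(6)
N, b = 16, 4

def blockwise_rank1(n, bs):
    """Matrix in which every bs x bs tile is an independent rank-1 matrix."""
    M = np.zeros((n, n))
    for i in range(0, n, bs):
        for j in range(0, n, bs):
            M[i:i+bs, j:j+bs] = np.outer(rng.normal(size=bs), rng.normal(size=bs))
    return M

fine = blockwise_rank1(N, b)                   # local correlations at block scale 4
coarse = np.outer(rng.normal(size=N), rng.normal(size=N))   # global scale component
X = fine + coarse                              # multi-scale low-rank data matrix

tile_rank = np.linalg.matrix_rank(fine[:b, :b])
```

Neither component alone explains `X`: the coarse term is globally rank 1 but each tile of the fine term is only low rank locally, which is exactly the structure the multi-scale decomposition is designed to separate.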