JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY

Limitations of CNNs for Approximating the Ideal Observer Despite Quantity of Training Data or Depth of Network
Omer K, Caucci L and Kupinski M
The performance of a convolutional neural network (CNN) on an image texture detection task is investigated as a function of linear image processing and the number of training images. Performance is quantified by the area under the receiver operating characteristic (ROC) curve (AUC). The Ideal Observer (IO) maximizes the AUC but depends on high-dimensional image likelihoods. In many cases, CNN performance can approximate IO performance. This work demonstrates counterexamples in which a full-rank linear transform degrades CNN performance below the IO, even in the limit of large quantities of training data and many network layers. A subsequent linear transform changes the images' correlation structure, improves the AUC, and again demonstrates the CNN's dependence on linear processing. Compression strictly decreases or maintains IO detection performance, whereas it can increase CNN performance, especially for small quantities of training data. Results indicate an optimal compression ratio for the CNN that depends on task difficulty, compression method, and the number of training images.
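The invariance of the IO to invertible linear processing is the crux of the counterexample: a full-rank transform cannot change what is detectable in principle, only what a trained network manages to learn. The sketch below illustrates that invariance for an equal-covariance Gaussian detection model, an assumption made purely for illustration (the abstract's texture task is not specified); dimensions, means, and covariances are placeholder values.

```python
# Minimal sketch (not the authors' experiment): for two equal-covariance
# Gaussian classes the Ideal Observer is the likelihood ratio, and its AUC
# is invariant under any full-rank linear transform T, while a trained
# classifier's AUC generally is not.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d = 16                                   # image dimension (illustrative)
mu0, mu1 = np.zeros(d), 0.3 * rng.standard_normal(d)
A = rng.standard_normal((d, d))
K = A @ A.T + np.eye(d)                  # common covariance

def io_auc(mu0, mu1, K):
    """AUC of the IO for equal-covariance Gaussians: Phi(d_A / sqrt(2))."""
    delta = mu1 - mu0
    snr2 = delta @ np.linalg.solve(K, delta)      # squared detectability d_A^2
    return norm.cdf(np.sqrt(snr2) / np.sqrt(2))

T = rng.standard_normal((d, d))          # full-rank with probability 1
print(io_auc(mu0, mu1, K))                        # IO AUC on raw images
print(io_auc(T @ mu0, T @ mu1, T @ K @ T.T))      # identical after transform
```

A CNN trained on the transformed images carries no such guarantee, which is the gap the study quantifies.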
A Pilot Study on EEG-Based Evaluation of Visually Induced Motion Sickness
Liu R, Xu M, Zhang Y, Peli E and Hwang AD
The most prominent problem in virtual reality (VR) technology is that users may experience motion sickness-like symptoms when immersed in a VR environment. These symptoms are recognized as visually induced motion sickness (VIMS) or virtual reality motion sickness (VRMS). The objectives of this study were to investigate the association between the electroencephalogram (EEG) and the subjectively rated VIMS level (VIMSL) and to identify EEG markers for VIMS evaluation. In this study, a VR-based vehicle-driving simulator (VDS) was used to induce VIMS symptoms, and a wearable EEG device with four electrodes, the Muse, was used to collect EEG data from subjects. Our results suggest that individual tolerance, susceptibility, and recoverability to VIMS varied widely among subjects; the following markers were significantly different between the no-VIMS and VIMS states (P < 0.05): (1) means of gravity frequency (GF) for theta@FP1, alpha@TP9, alpha@FP2, alpha@TP10, and beta@FP1; (2) standard deviation of GF for alpha@TP9, alpha@FP1, alpha@FP2, alpha@TP10, and alpha@(FP2-FP1); (3) standard deviation of power spectral entropy (PSE) for FP1; (4) means of Kolmogorov complexity (KC) for TP9, FP1, and FP2. These results also demonstrate that it is feasible to perform VIMS evaluation using an EEG device with a small number of electrodes.
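For readers unfamiliar with the spectral markers named above, the sketch below shows one common way to compute gravity frequency and power spectral entropy for a single EEG channel; the band edges, 256 Hz sampling rate, and Welch estimator are assumptions for illustration, not the authors' exact processing pipeline.

```python
# Minimal sketch (assumed formulas, not the paper's pipeline): gravity
# frequency (GF) as the power-weighted mean frequency, and power spectral
# entropy (PSE) as the Shannon entropy of the normalized power spectrum,
# both computed from a Welch power spectral density within a band.
import numpy as np
from scipy.signal import welch

def gf_and_pse(eeg, fs=256.0, band=(8.0, 13.0)):
    """Return (GF, PSE) of one EEG channel within a frequency band (e.g., alpha)."""
    f, pxx = welch(eeg, fs=fs, nperseg=int(2 * fs))
    m = (f >= band[0]) & (f <= band[1])
    f, pxx = f[m], pxx[m]
    gf = np.sum(f * pxx) / np.sum(pxx)               # gravity (centroid) frequency
    p = pxx / np.sum(pxx)
    pse = -np.sum(p * np.log(p)) / np.log(len(p))    # entropy, normalized to [0, 1]
    return gf, pse

# Example with surrogate data standing in for one Muse channel (e.g., TP9):
rng = np.random.default_rng(1)
gf, pse = gf_and_pse(rng.standard_normal(30 * 256), fs=256.0)
```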
RemBrain: Exploring Dynamic Biospatial Networks with Mosaic Matrices and Mirror Glyphs
Ma C, Pellolio F, Llano DA, Stebbings KA, Kenyon RV and Marai GE
We introduce a web-based visual comparison approach for the systematic exploration of dynamic activation networks across biological datasets. Understanding the dynamics of such networks in the context of demographic factors like age is a fundamental problem in computational systems biology and neuroscience. We design visual encodings for the dynamic and community characteristics of these temporal networks. Our multi-scale approach blends nested mosaic matrices that capture temporal characteristics of the data, spatial views of the network data, and Kiviat diagrams and mirror glyphs that detail the temporal behavior and community assignment of specific nodes. A top-level design specifically targeted at pairwise visual comparison further supports the comparative analysis of multiple dataset activations. We demonstrate the effectiveness of this approach through a case study on mouse brain network data. Domain expert feedback indicates this approach can help identify trends and anomalies in the data.
Segmentation of Brain Immunohistochemistry Images Using Clustering of Linear Centroids and Regional Shapes
Wu HS, Murray J and Morgello S
A generalized clustering algorithm utilizing the geometrical shapes of clusters for the segmentation of colored brain immunohistochemical images is presented. To simplify the computation, the dimension of the vectors composed of the pixel RGB components is reduced from three to two by applying a de-correlation mapping onto the orthogonal basis formed by the eigenvectors of the auto-covariance matrix. Since brain immunohistochemical images have stretched clusters that appear long and narrow in geometrical shape, we use centroids of straight lines instead of single points to approximate the clusters. An iterative algorithm is developed to optimize the linear centroids by minimizing the approximation mean-squared error. Partitioning the two-dimensional vector domain into three portions classifies each image pixel into one of three classes: the microglial cell cytoplasm, the combined hematoxylin-stained cell nuclei and neuropil, and the pale background. Regions of the combined hematoxylin-stained cell nuclei and neuropil are then separated based on differences in their regional shapes. Segmentation results for real immunohistochemical images of brain microglia are provided and discussed.
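A minimal sketch of the two preprocessing ideas described above, under the assumption that the de-correlation is an eigen-decomposition of the pixel auto-covariance and that a "linear centroid" is the total least-squares line through a cluster; the function names and the distance helper are illustrative, not the paper's implementation.

```python
# Minimal sketch (assumptions noted in comments): de-correlate pixel RGB
# vectors with the eigenvectors of their auto-covariance matrix and keep the
# two leading components; a cluster's "linear centroid" is then the total
# least-squares line through its points (mean + leading eigenvector).
import numpy as np

def decorrelate_rgb(pixels_rgb):
    """pixels_rgb: (N, 3) float array -> (N, 2) de-correlated components."""
    x = pixels_rgb - pixels_rgb.mean(axis=0)
    cov = np.cov(x, rowvar=False)                # 3x3 auto-covariance
    evals, evecs = np.linalg.eigh(cov)           # orthogonal eigenvector basis
    order = np.argsort(evals)[::-1]              # descending variance
    return x @ evecs[:, order[:2]]               # drop the weakest component

def linear_centroid(points_2d):
    """Fit a line (point p0, unit direction u) minimizing squared distances."""
    p0 = points_2d.mean(axis=0)
    cov = np.cov(points_2d - p0, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    return p0, evecs[:, np.argmax(evals)]

def dist_to_line(points_2d, p0, u):
    """Perpendicular distance of each point to the line; such distances would
    drive the iterative reassignment / re-fitting loop the abstract describes."""
    r = points_2d - p0
    return np.abs(r[:, 0] * u[1] - r[:, 1] * u[0])
```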
Multi-GPU Acceleration of Branchless Distance Driven Projection and Backprojection for Clinical Helical CT
Mitra A, Politte DG, Whiting BR, Williamson JF and O'Sullivan JA
Model-based image reconstruction (MBIR) techniques have the potential to generate high-quality images from noisy measurements and a small number of projections, which can reduce the x-ray dose to patients. These MBIR techniques rely on projection and backprojection to refine an image estimate. One of the most widely used projector models for modern MBIR techniques is branchless distance-driven (DD) projection and backprojection. While this method produces superior-quality images, the computational cost of the iterative updates keeps it from being ubiquitous in clinical applications. In this paper, we provide several new parallelization ideas for the concurrent execution of the DD projectors on multi-GPU systems using CUDA programming tools. We introduce novel schemes for dividing the projection data and image voxels over multiple GPUs to avoid runtime overhead and inter-device synchronization issues. We also reduce the complexity of the algorithm's overlap calculation by eliminating the common projection plane and directly projecting the detector boundaries onto the image voxel boundaries. To reduce the time required for calculating the overlap between detector edges and image voxel boundaries, we propose a pre-accumulation technique that accumulates image intensities in perpendicular 2D image slabs (from the 3D image) before projection and after backprojection, so that our DD kernels run faster in parallel GPU threads. For the implementation of our iterative MBIR technique, we use a parallel multi-GPU version of the alternating minimization (AM) algorithm with a penalized-likelihood update. Timing results for the proposed reconstruction method on Siemens Sensation 16 patient scan data show an average speedup of 24 times using a single TITAN X GPU and 74 times using 3 TITAN X GPUs in parallel for combined projection and backprojection.
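A minimal CPU-side sketch of how pre-accumulation can speed up the overlap-weighted sums that distance-driven projection requires, assuming the accumulation amounts to prefix sums of voxel intensities along one in-plane axis; this is one reading of the abstract, not the authors' CUDA kernels.

```python
# Minimal sketch (an interpretation of the pre-accumulation idea): prefix sums
# of voxel intensities along one axis turn the "sum over a run of voxels
# between two projected detector boundaries" into two lookups and a
# subtraction per detector cell, independent of the run length.
import numpy as np

def preaccumulate(volume, axis=0):
    """Prefix sums along `axis` with a leading zero slab, so that the sum of
    voxel slabs i0..i1-1 along that axis equals acc[i1] - acc[i0]."""
    acc = np.cumsum(volume, axis=axis)
    pad = [(0, 0)] * volume.ndim
    pad[axis] = (1, 0)
    return np.pad(acc, pad)

def range_sum(acc, i0, i1, axis=0):
    """Sum of the original volume over slabs in the half-open range [i0, i1)."""
    return np.take(acc, i1, axis=axis) - np.take(acc, i0, axis=axis)

# Example: summing voxel slabs [3, 5) along axis 1 in O(1) per boundary pair.
vol = np.arange(4 * 5 * 6, dtype=float).reshape(4, 5, 6)
acc = preaccumulate(vol, axis=1)
assert np.allclose(range_sum(acc, 3, 5, axis=1), vol[:, 3:5, :].sum(axis=1))
```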
Stereoscopic 3D Optic Flow Distortions Caused by Mismatches between Image Acquisition and Display Parameters
Hwang AD and Peli E
We analyze the impact of common stereoscopic 3D (S3D) depth distortion on S3D optic flow in virtual reality (VR) environments. The depth distortion is introduced by mismatches between the image acquisition and display parameters. The results show that such S3D distortions induce large S3D optic flow distortions and may even induce partial or full optic flow reversal within a certain depth range, depending on the viewer's moving speed and the magnitude of the S3D distortion introduced. We hypothesize that this S3D optic flow distortion may be a source of intra-sensory conflict that in turn contributes to visually induced motion sickness (VIMS) in S3D.
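For context, a standard stereo-geometry model (not taken from the paper) makes the acquisition/display mismatch explicit; the symbols below are generic rig and viewing parameters, and the veridicality condition follows directly from the two equations.

```latex
% A standard stereo-geometry relation, given here only as background.
% Screen disparity produced by a parallel (shifted-sensor) camera rig:
%   b = camera separation, f = focal length, M = sensor-to-screen magnification,
%   C = convergence (zero-disparity) distance, Z = object distance.
\[
  d(Z) \;=\; M f b \left(\frac{1}{C} - \frac{1}{Z}\right)
\]
% Perceived distance for a viewer with interocular distance e at viewing
% distance V from the screen:
\[
  Z_p(Z) \;=\; \frac{e\,V}{\,e - d(Z)\,}
\]
% Depth is reproduced without distortion only when M f b = eC and V = C;
% any mismatch makes Z_p a nonlinear function of Z, so the optic-flow field
% dZ_p/dt inherits (and within some depth range can even reverse) that distortion.
```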
Digital Modeling on Large Kernel Metamaterial Neural Network
Liu Q, Zheng H, Swartz BT, Lee HH, Asad Z, Kravchenko I, Valentine JG and Huo Y
Deep neural networks (DNNs) are currently deployed on physical computational units (e.g., CPUs and GPUs). Such a design can lead to a heavy computational burden, significant latency, and intensive power consumption, which are critical limitations in applications such as the Internet of Things (IoT), edge computing, and the use of drones. Recent advances in optical computational units (e.g., metamaterials) have shed light on energy-free and light-speed neural networks. However, the digital design of a metamaterial neural network (MNN) is fundamentally constrained by physical limitations such as precision, noise, and bandwidth during fabrication. Moreover, the unique advantages of MNNs (e.g., light-speed computation) are not fully explored via standard 3×3 convolution kernels. In this paper, we propose a novel large-kernel metamaterial neural network (LMNN) that maximizes the digital capacity of the state-of-the-art (SOTA) MNN with model re-parametrization and network compression, while also considering the optical limitations explicitly. The new digital learning scheme maximizes the learning capacity of the MNN while modeling the physical restrictions of meta-optics. With the proposed LMNN, the computational cost of the convolutional front-end can be offloaded onto fabricated optical hardware. Experimental results on two publicly available datasets demonstrate that the optimized hybrid design improves classification accuracy while reducing computational latency. The development of the proposed LMNN is a promising step toward the ultimate goal of energy-free and light-speed AI.
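As an illustration of the re-parametrization idea in a purely digital setting, the sketch below folds a parallel small-kernel branch into a single large kernel after training; it is a generic structural re-parameterization trick, not the LMNN code, and it assumes odd kernel sizes, matching stride, "same" padding, and no intervening normalization or nonlinearity.

```python
# Minimal sketch (generic re-parameterization, not the authors' LMNN):
# because convolution is linear, a 3x3 branch trained in parallel with a
# large KxK convolution can be merged into one KxK kernel at inference time
# by zero-padding the small kernel to KxK and summing the weights.
import numpy as np

def merge_kernels(large_k, small_k):
    """large_k: (out_ch, in_ch, K, K), small_k: (out_ch, in_ch, k, k), K >= k,
    both odd and centered. Returns an equivalent single (out_ch, in_ch, K, K) kernel."""
    K, k = large_k.shape[-1], small_k.shape[-1]
    pad = (K - k) // 2
    padded = np.pad(small_k, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    return large_k + padded

# Example: fold a 3x3 branch into a 7x7 kernel for a single fused convolution.
rng = np.random.default_rng(2)
fused = merge_kernels(rng.standard_normal((8, 4, 7, 7)),
                      rng.standard_normal((8, 4, 3, 3)))
```

The fused kernel produces the same output as the two-branch sum, which is what allows a single large-kernel front-end to be mapped onto the optical hardware without retraining.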