LightR-YOLOv5: A compact rotating detector for SARS-CoV-2 antigen-detection rapid diagnostic test results
Nucleic acid testing is currently the gold standard for detection of the coronavirus SARS-CoV-2, while the SARS-CoV-2 antigen-detection rapid diagnostic test (RDT) is an important adjunct. RDTs can be widely used as self-test tools in community or regional screening, and their results may need to be verified by healthcare authorities. However, manual verification of RDT results is time-consuming, and existing object detection algorithms usually suffer from high model complexity and computational cost, making them difficult to deploy. We propose LightR-YOLOv5, a compact rotating detector for SARS-CoV-2 antigen-detection RDT results. First, we employ an extremely lightweight L-ShuffleNetV2 network as the feature extraction backbone, at the cost of a slight reduction in recognition accuracy. Second, we fuse semantic and texture features from different layers by judiciously combining GSConv, depth-wise convolution, and other modules, and further employ NAM attention to locate the RDT result region. Furthermore, we propose a new data augmentation approach, Single-Copy-Paste, which increases the number of training samples for the specific task of RDT result detection and yields a small improvement in model accuracy. Compared with mainstream rotated object detection networks, LightR-YOLOv5 has a model size of only 2.03 MB and achieves mAP@.5:.95 scores that are 12.6%, 6.4%, and 7.3% higher than those of RetinaNet, FCOS, and RDet, respectively.
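The abstract does not spell out how Single-Copy-Paste works, so the following is only a minimal sketch of a copy-paste style augmentation under stated assumptions: axis-aligned (x1, y1, x2, y2) integer boxes rather than rotated ones, a single pasted instance per image, and no occlusion check against existing annotations.

```python
# Minimal copy-paste augmentation sketch (illustrative, not the paper's method).
import random
import numpy as np

def single_copy_paste(src_img, src_box, dst_img, dst_boxes):
    """Copy one annotated RDT result region from src_img into dst_img.

    src_box:   (x1, y1, x2, y2) of the region to copy (assumed smaller than dst_img).
    dst_boxes: list of existing (x1, y1, x2, y2) boxes in dst_img.
    Returns the augmented image and its updated box list.
    A real implementation would also reject placements that occlude existing boxes.
    """
    x1, y1, x2, y2 = src_box
    patch = src_img[y1:y2, x1:x2].copy()
    ph, pw = patch.shape[:2]
    H, W = dst_img.shape[:2]

    # Pick a random location where the whole patch fits inside the destination image.
    px = random.randint(0, W - pw)
    py = random.randint(0, H - ph)

    out = dst_img.copy()
    out[py:py + ph, px:px + pw] = patch
    new_boxes = list(dst_boxes) + [(px, py, px + pw, py + ph)]
    return out, new_boxes
```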
COVID-19 chest X-ray image classification in the presence of noisy labels
Coronavirus Disease 2019 (COVID-19) has been declared a worldwide pandemic, and a key method for diagnosing COVID-19 is chest X-ray imaging. Applying convolutional neural networks to medical imaging helps diagnose the disease accurately, and label quality plays an important role in the classification of COVID-19 chest X-rays. However, most existing classification methods ignore the fact that labels are rarely completely correct, and noisy labels lead to a significant degradation in the performance of image classification frameworks. In addition, because lesions in COVID-19 chest X-ray images are widely distributed and contain a large number of local features, existing label recovery algorithms face the bottleneck that noisy samples are difficult to reuse. Therefore, this paper introduces a general classification framework for COVID-19 chest X-ray images with noisy labels and proposes a noisy label recovery algorithm based on subset label iterative propagation and replacement (SLIPR). Specifically, the proposed algorithm first draws random subsets of the samples multiple times. Then, it integrates several techniques, including principal component analysis, low-rank representation, neighborhood graph regularization, and k-nearest neighbors, for feature extraction and image classification. Finally, multi-level weight distribution and replacement are performed on the labels to cleanse the noise. In addition, for the label-recovered dataset, high-confidence samples are further selected as the training set to improve the stability and accuracy of the classification framework without affecting its inherent performance. Three typical datasets are chosen to conduct extensive experiments and comparisons with existing algorithms under different metrics. Experimental results on three publicly available COVID-19 chest X-ray image datasets show that the proposed algorithm can effectively recover noisy labels and improve the accuracy of the image classification framework by 18.9% on the Tawsifur dataset, 19.92% on the Skytells dataset, and 16.72% on the CXRs dataset. Compared with state-of-the-art algorithms, the gain in classification accuracy of SLIPR on the three datasets reaches 8.67%-19.38%, and the proposed algorithm also offers a degree of scalability while preserving data integrity.
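The following is a heavily simplified sketch in the spirit of SLIPR's subset-based label cleansing, keeping only two of the ingredients the abstract names (PCA features and k-nearest-neighbor voting over repeated random subsets); the low-rank representation, neighborhood graph regularization, and multi-level weighting of the actual algorithm are omitted, and all names and parameter values are illustrative assumptions.

```python
# Simplified subset-vote label cleansing sketch (not the full SLIPR algorithm).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def cleanse_labels(X, y, n_rounds=20, subset_frac=0.7, n_components=50,
                   k=5, replace_thresh=0.8, seed=0):
    """X: (n, d) feature matrix; y: (n,) integer labels (possibly noisy)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_classes = int(y.max()) + 1
    feats = PCA(n_components=min(n_components, X.shape[1])).fit_transform(X)
    votes = np.zeros((n, n_classes))

    for _ in range(n_rounds):
        idx = rng.choice(n, size=int(subset_frac * n), replace=False)
        knn = KNeighborsClassifier(n_neighbors=k).fit(feats[idx], y[idx])
        pred = knn.predict(feats)          # predict a label for every sample
        votes[np.arange(n), pred] += 1     # accumulate hard votes across rounds

    conf = votes / votes.sum(axis=1, keepdims=True)
    consensus = conf.argmax(axis=1)
    # Replace a label only when the consensus disagrees with it and is confident.
    replace = (consensus != y) & (conf.max(axis=1) >= replace_thresh)
    y_clean = y.copy()
    y_clean[replace] = consensus[replace]
    return y_clean, replace
```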
SELDNet: Sequenced encoder and lightweight decoder network for COVID-19 infection region segmentation
Segmenting regions of lung infection from computed tomography (CT) images shows excellent potential for rapidly and accurately quantifying Coronavirus disease 2019 (COVID-19) infection and determining disease progression and treatment approaches. However, a number of challenges remain, including the complexity of imaging features and their variability with disease progression, as well as the high similarity to other lung diseases, which makes feature extraction difficult. To address these challenges, we propose SELDNet, a new medical image segmentation model with a sequenced encoder and a lightweight decoder. (i) The sequenced encoder and the lightweight decoder are built on Transformers and depthwise separable convolutions, respectively, to extract features at different granularities. (ii) A semantic association module based on a cross-attention mechanism is designed between the encoder and decoder to enhance the fusion of semantics at different levels. The experimental results show that the network can effectively segment COVID-19 infected regions: the Dice score of the segmentation result was 79.1%, the sensitivity was 76.3%, and the specificity was 96.7%. Compared with several state-of-the-art image segmentation models, the proposed SELDNet achieves better results in the segmentation of COVID-19 infected regions.
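A cross-attention block between encoder and decoder features, of the general kind the abstract calls a semantic association module, can be sketched as follows; the token dimensions, head count, and residual/layer-norm arrangement are assumptions, not SELDNet's actual design.

```python
# Generic encoder-decoder cross-attention sketch (PyTorch), for illustration only.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, dec_tokens, enc_tokens):
        # dec_tokens: (B, Nd, C) decoder features used as queries
        # enc_tokens: (B, Ne, C) encoder features used as keys/values
        fused, _ = self.attn(query=dec_tokens, key=enc_tokens, value=enc_tokens)
        return self.norm(dec_tokens + fused)   # residual connection, then layer norm

# Example: fuse 14x14 decoder tokens with 28x28 encoder tokens.
dec = torch.randn(2, 14 * 14, 256)
enc = torch.randn(2, 28 * 28, 256)
out = CrossAttentionFusion()(dec, enc)         # shape (2, 196, 256)
```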
ViDMASK dataset for face mask detection with social distance measurement
The COVID-19 outbreak has accentuated the need for an AI-based system that can monitor face-mask adherence and social distancing. Building on existing video surveillance systems, a deep learning approach is proposed for mask detection and social distance measurement. State-of-the-art object detection and recognition models such as Mask RCNN, YOLOv4, YOLOv5, and YOLOR were trained for mask detection and evaluated on existing datasets and on a newly proposed video mask detection dataset, ViDMASK. The obtained results achieved a comparatively high mean average precision of 92.4% for YOLOR. After mask detection, the distance between people's faces is measured and classified as high-risk or low-risk. Furthermore, the new large-scale video mask dataset, ViDMASK, diversifies the subjects in terms of pose, environment, image quality, and subject characteristics, producing a challenging dataset. The tested models succeed in detecting face masks with high performance on the existing dataset, MOXA. However, on the ViDMASK dataset, most models are less accurate because of the complexity of the dataset and the number of people in each scene. The ViDMASK dataset and the base code are available at https://github.com/ViDMask/VidMask-code.git.
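The social-distance step can be illustrated with a minimal sketch: given detected face or person boxes, flag pairs whose centers fall below a distance threshold. The pixel threshold and centroid-based distance are assumptions; a deployed system would calibrate pixel distances to real-world metres (for example via a ground-plane homography).

```python
# Minimal pairwise-distance risk check over detector boxes (illustrative only).
import itertools
import math

def high_risk_pairs(boxes, min_dist_px=150):
    """boxes: list of (x1, y1, x2, y2); returns index pairs closer than the threshold."""
    centers = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]
    risky = []
    for (i, a), (j, b) in itertools.combinations(enumerate(centers), 2):
        if math.dist(a, b) < min_dist_px:
            risky.append((i, j))
    return risky

print(high_risk_pairs([(0, 0, 50, 80), (60, 10, 110, 90), (400, 0, 450, 80)]))
# -> [(0, 1)]  (only the first two boxes are close enough to be flagged)
```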
Recognition efficiency of atypical cardiovascular readings on ECG devices through fogged goggles
In their continuing battle against the COVID-19 pandemic, medical workers in hospitals worldwide need to wear safety glasses and goggles to protect their eyes from possible transmission of the virus. However, they work long hours while wearing a mask and other personal protective equipment, which causes their protective eyewear to fog up. This fogging, in turn, has a substantial impact on the speed and accuracy of reading information on the interface of electrocardiogram (ECG) machines. To better understand the extent of this impact, this study experimentally simulates viewing the interface through fogged protective goggles with three variables: the degree of fogging of the goggles, the brightness of the screen, and the font color of the cardiovascular readings. The experiment examines recognition of digital-font targets by simulating the interface of an ECG machine and its readability through fogged eyewear. The results indicate that fogging of the lenses has a significant impact on recognition speed, and that the degree of fogging shows a significant interaction with font color and screen brightness. As screen brightness is reduced, its influence on recognition speed follows a V-shaped trend, with the shortest response time at a screen brightness of 150 cd/m². When eyewear is fogged, yellow and green font colors allow quicker responses with higher accuracy; overall, subjects perform best with green font, although there are some inconsistencies. The interactions among the three variables show the same pattern and support the same conclusion. This study can serve as a reference for the interface design of medical equipment in settings where medical staff wear protective eyewear for long periods.
COVID-19 CT image recognition algorithm based on transformer and CNN
Novel coronavirus pneumonia (COVID-19) broke out in 2019 and has had a great impact on the world economy and people's lives. As a mainstream image processing method, deep learning networks have been constructed to extract medical features from chest CT images and have been used as a new detection method in clinical practice. However, because lesions in COVID-19 CT images are widely distributed and exhibit many local features, direct diagnosis with existing deep learning models is difficult. Based on the medical features of COVID-19 CT images, a parallel bi-branch model (Trans-CNN Net) built from a Transformer module and a Convolutional Neural Network (CNN) module is proposed, making full use of the local feature extraction capability of the CNN and the global feature extraction advantage of the Transformer. Following the principle of cross-fusion, a bi-directional feature fusion structure is designed, in which features extracted by the two branches are fused in both directions and the parallel branches are combined by a feature fusion module, forming a model that can extract features at different scales. On the COVIDx-CT dataset, the classification accuracy of the proposed model is 96.7%, notably higher than that of a typical CNN (ResNet-152, 95.2%) and a Transformer network (DeiT-B, 75.8%), demonstrating the improvement in accuracy. The model thus provides a new method for the diagnosis of COVID-19; by combining deep learning with medical imaging, it advances real-time diagnosis of lung disease caused by COVID-19 infection, supporting reliable and rapid diagnosis and thus helping to save lives.
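A minimal sketch of one bi-directional fusion step between a CNN feature map and a Transformer token sequence is given below; the 1x1 projections and the simple additive exchange are illustrative assumptions rather than the paper's exact fusion module, and the token count is assumed to match the spatial size of the feature map.

```python
# Illustrative bi-directional CNN <-> Transformer feature exchange (PyTorch).
import torch
import torch.nn as nn

class BiDirectionalFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.cnn_to_trans = nn.Linear(dim, dim)                  # inject CNN context into tokens
        self.trans_to_cnn = nn.Conv2d(dim, dim, kernel_size=1)   # inject tokens into the feature map

    def forward(self, cnn_feat, tokens):
        # cnn_feat: (B, C, H, W) CNN branch; tokens: (B, H*W, C) Transformer branch
        B, C, H, W = cnn_feat.shape
        # CNN -> Transformer: flatten the feature map into tokens and add.
        cnn_tokens = cnn_feat.flatten(2).transpose(1, 2)         # (B, H*W, C)
        tokens = tokens + self.cnn_to_trans(cnn_tokens)
        # Transformer -> CNN: reshape tokens back to a map and add.
        token_map = tokens.transpose(1, 2).reshape(B, C, H, W)
        cnn_feat = cnn_feat + self.trans_to_cnn(token_map)
        return cnn_feat, tokens

cnn_feat = torch.randn(2, 256, 14, 14)
tokens = torch.randn(2, 14 * 14, 256)
cnn_out, tok_out = BiDirectionalFusion()(cnn_feat, tokens)
```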
Sensitivity to Visual Speed Modulation in Head-Mounted Displays Depends on Fixation
A primary cause of simulator sickness in head-mounted displays (HMDs) is conflict between the visual scene displayed to the user and the visual scene expected by the brain when the user's head is in motion. It is useful to measure perceptual sensitivity to visual speed modulation in HMDs because conditions that minimize this sensitivity may prove less likely to elicit simulator sickness. In prior research, we measured sensitivity to visual gain modulation during slow, passive, full-body yaw rotations and observed that sensitivity was reduced when subjects fixated a head-fixed target compared with when they fixated a scene-fixed target. In the current study, we investigated whether this pattern of results persists when (1) movements are faster, active head turns, and (2) visual stimuli are presented on an HMD rather than on a monitor. Subjects wore an Oculus Rift CV1 HMD and viewed a 3D scene of white points on a black background. On each trial, subjects moved their head from a central position to face a 15° eccentric target. During the head movement they fixated a point that was either head-fixed or scene-fixed, depending on condition. They then reported if the visual scene motion was too fast or too slow. Visual speed on subsequent trials was modulated according to a staircase procedure to find the speed increment that was just noticeable. Sensitivity to speed modulation during active head movement was reduced during head-fixed fixation, similar to what we observed during passive whole-body rotation. We conclude that fixation of a head-fixed target is an effective way to reduce sensitivity to visual speed modulation in HMDs, and may also be an effective strategy to reduce susceptibility to simulator sickness.
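The abstract does not state which staircase rule or step sizes were used, so the following is only a generic 1-up/2-down staircase sketch with a simulated observer; all parameter values, and the observer model itself, are assumptions for illustration.

```python
# Generic 1-up/2-down adaptive staircase sketch with a simulated observer.
import random

def staircase(step=0.1, n_reversals=8, start=1.0, observer_threshold=0.35):
    level, streak, last_dir, reversals = start, 0, None, []
    while len(reversals) < n_reversals:
        # Simulated observer: notices the modulation when it exceeds threshold,
        # with a 10% lapse rate in which the response is a coin flip.
        noticed = (level > observer_threshold) if random.random() > 0.1 \
                  else (random.random() < 0.5)
        if noticed:
            streak += 1
            if streak < 2:
                continue            # need two consecutive "noticed" trials to step down
            streak, move = 0, -1
        else:
            streak, move = 0, +1    # a single miss steps the level up
        if last_dir is not None and move != last_dir:
            reversals.append(level) # record a reversal of direction
        last_dir = move
        level = max(level + move * step, step)
    return sum(reversals) / len(reversals)  # threshold estimate from reversal levels

print(round(staircase(), 3))  # converges near the simulated observer's threshold
```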
Generating an image that affords slant perception from stereo, without pictorial cues
This paper describes an algorithm for generating a planar image that when tilted provides stereo cues to slant, without contamination from pictorial gradients. As the stimuli derived from this image are ultimately intended for use in studies of slant perception under magnification, a further requirement is that the generated image be suitable for high-definition printing or display on a monitor. A first stage generates an image consisting of overlapping edges with sufficient density that when zoomed, edges that nearly span the original scale are replaced with newly emergent content that leaves the visible edge statistics unchanged. A second stage reduces intensity clumping while preserving edges by enforcing a broad dynamic range across the image. Spectral analyses demonstrate that the low-frequency content of the resulting image, which would correspond to the pictorial cue of texture gradient changes under slant, (a) has a power fall-off deviating from 1/f noise (to which the visual system is particularly sensitive), and (b) does not offer systematic cues under changes in scale or slant. Two behavioral experiments tested whether the algorithm generates stimuli that offer cues to slant under stereo viewing only, and not when disparities are eliminated. With a particular adjustment of dynamic range (and nearly so with the other version that was tested), participants viewing without stereo cues were essentially unable to discriminate slanted from flat (frontal) stimuli, and when slant was reported, they failed to discriminate its direction. In contrast, non-stereo viewing of a control stimulus with pictorial cues, as well as stereoscopic observation, consistently allowed participants to perceive slant correctly. Experiment 2 further showed that these results generalized across a population of different stimuli from the same generation process and demonstrated that the process did not substitute biased slant cues.
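The spectral analysis can be illustrated with a generic radially averaged power spectrum and a log-log slope fit (a slope near -2 corresponds to 1/f-amplitude, pink-noise-like images); this is a standard analysis, not the authors' exact procedure, and all names below are illustrative.

```python
# Radially averaged power spectrum slope of a 2D image (generic sketch).
import numpy as np

def radial_power_slope(img):
    f = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    power = np.abs(f) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(x - w // 2, y - h // 2).astype(int)

    # Average power within integer-radius annuli (skip DC, stop at Nyquist).
    radial = np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())
    freqs = np.arange(1, min(h, w) // 2)
    slope, _ = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)
    return slope   # roughly -2 for 1/f-amplitude images; near 0 for white noise

rng = np.random.default_rng(0)
print(round(radial_power_slope(rng.standard_normal((256, 256))), 2))  # white noise: ~0
```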
Displays in space
This chapter describes the human and environmental factors that dictate the way displays must be designed for, and used in, space. A brief history of the evolution of such display systems covers developments from the Mercury rockets to the International Space Station.