COMPUTERIZED MEDICAL IMAGING AND GRAPHICS

AFSegNet: few-shot 3D ankle-foot bone segmentation via hierarchical feature distillation and multi-scale attention and fusion
Huang Y, Holcombe SA, Wang SC and Tang J
Accurate segmentation of ankle and foot bones from CT scans is essential for morphological analysis. Ankle and foot bone segmentation is challenging due to blurred bone boundaries, narrow inter-bone gaps, gaps in the cortical shell, and uneven spongy bone textures. Our study endeavors to create a deep learning framework that harnesses the advantages of 3D deep learning and tackles the hurdles in accurately segmenting ankle and foot bones from clinical CT scans. A few-shot framework, AFSegNet, is proposed with computational cost in mind; it comprises three 3D deep-learning networks that follow the principle of progressing from simple to complex tasks and network structures. Specifically, a shallow network first over-segments the foreground; its output, together with the foreground ground truth, is used to supervise a subsequent network that detects the over-segmented regions, which are overwhelmingly inter-bone gaps. The foreground and inter-bone gap probability maps are then input into a network with multi-scale attention and feature fusion, trained with a loss function combining region-, boundary-, and topology-based terms, to obtain the fine-level bone segmentation. AFSegNet is applied to the 16-class segmentation task using 123 in-house CT scans, and requires only a GPU with 24 GB of memory because the three sub-networks can be trained successively and individually. AFSegNet achieves a Dice of 0.953 and an average surface distance of 0.207. The ablation study and comparison with two basic state-of-the-art networks indicate the effectiveness of the progressively distilled features, attention and feature fusion modules, and hybrid loss functions, with the mean surface distance error decreased by up to 50 %.
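For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how a per-label Dice coefficient can be computed on flattened label maps. It is an illustration only, not the authors' evaluation code: the function name `dice_coefficient`, the toy label sequences, and the convention of returning 1.0 when a label is absent from both maps are all assumptions.

```python
def dice_coefficient(pred, truth, label):
    """Dice overlap for one label between two equally sized flat label sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if pred_count + truth_count == 0:
        # Assumed convention: a label absent from both maps counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / (pred_count + truth_count)


# Toy example with background (0) and two hypothetical bone labels (1 and 2).
pred = [0, 1, 1, 2, 2, 2, 0, 0]
truth = [0, 1, 1, 1, 2, 2, 2, 0]

dice_label_1 = dice_coefficient(pred, truth, 1)  # 2 * 2 / (2 + 3)
dice_label_2 = dice_coefficient(pred, truth, 2)  # 2 * 2 / (3 + 3)
mean_dice = (dice_label_1 + dice_label_2) / 2
```

In a multi-class setting such as the 16-class task, a mean over the per-label values is one common way to report a single number; whether the abstract's figure is computed this way is not stated.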
DSIFNet: Implicit feature network for nasal cavity and vestibule segmentation from 3D head CT
Lu Y, Gao H, Qiu J, Qiu Z, Liu J and Bai X
This study is dedicated to accurately segmenting the nasal cavity and its intricate internal anatomy from head CT images, which is critical for understanding nasal physiology, diagnosing diseases, and planning surgeries. The nasal cavity and its anatomical structures, such as the sinuses and vestibule, exhibit significant scale differences, with complex shapes and variable microstructures. These features require the segmentation method to have strong cross-scale feature extraction capabilities. To effectively address this challenge, we propose an image segmentation network named the Deeply Supervised Implicit Feature Network (DSIFNet). This network uniquely incorporates an Implicit Feature Function Module Guided by Local and Global Positional Information (LGPI-IFF), enabling effective fusion of features across scales and enhancing the network's ability to recognize details and overall structures. Additionally, we introduce a deep supervision mechanism based on implicit feature functions in the network's decoding phase, optimizing the utilization of multi-scale feature information, thus improving segmentation precision and detail representation. Furthermore, we constructed a dataset comprising 7116 CT volumes (including 1,292,508 slices) and implemented PixPro-based self-supervised pretraining to utilize unlabeled data for enhanced feature extraction. Our tests on nasal cavity and vestibule segmentation, conducted on a dataset comprising 128 head CT volumes (including 34,006 slices), demonstrate the robustness and superior performance of the proposed method, achieving leading results across multiple segmentation metrics.
Exploring transformer reliability in clinically significant prostate cancer segmentation: A comprehensive in-depth investigation
Andrade-Miranda G, Vega PS, Taguelmimt K, Dang HP, Visvikis D and Bert J
Despite the growing prominence of transformers in medical image segmentation, their application to clinically significant prostate cancer (csPCa) has been overlooked. Minimal attention has been paid to domain shift analysis and uncertainty assessment, critical for safely implementing computer-aided diagnosis (CAD) systems. Domain shift in medical imagery refers to differences between the data used to train a model and the data evaluated later, arising from variations in imaging equipment, protocols, patient populations, and acquisition noise. While recent models enhance in-domain performance, areas such as robustness and uncertainty estimation in out-of-domain distributions have received limited investigation, creating indecisiveness about model reliability. In contrast, our study addresses csPCa at voxel, lesion, and image levels, investigating models from traditional U-Net to cutting-edge transformers. We focus on four key points: robustness, calibration, out-of-distribution (OOD), and misclassification detection (MD). Findings show that transformer-based models exhibit enhanced robustness at image and lesion levels, both in and out of domain. However, this improvement is not fully translated to the voxel level, where Convolutional Neural Networks (CNNs) outperform in most robustness metrics. Regarding uncertainty, hybrid transformers and transformer encoders performed better, but this trend depends on misclassification or out-of-distribution tasks.
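The abstract above lists calibration among its key points without naming a metric. As a hedged illustration, expected calibration error (ECE) is one widely used calibration measure; the sketch below computes it with equal-width confidence bins. The function name, the bin count, and the toy inputs are assumptions and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted mean gap between accuracy and mean confidence per bin."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    conf_sum = [0.0] * n_bins
    hit_sum = [0] * n_bins
    count = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += 1 if ok else 0
        count[idx] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


# Toy example: four predictions made at 0.8 confidence, three of them correct.
ece_overconfident = expected_calibration_error([0.8, 0.8, 0.8, 0.8], [1, 1, 1, 0])
# Toy example: fully confident and always correct, so the gap is zero.
ece_perfect = expected_calibration_error([1.0, 1.0], [1, 1])
```

A lower value indicates that predicted confidence tracks observed accuracy more closely; for segmentation, the inputs would typically be per-voxel confidences and correctness flags.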
Self-supervised multi-modal feature fusion for predicting early recurrence of hepatocellular carcinoma
Wang S, Zhao Y, Li J, Yi Z, Li J, Zuo C, Yao Y and Liu A
Surgical resection stands as the primary treatment option for early-stage hepatocellular carcinoma (HCC) patients. Postoperative early recurrence (ER) is a significant factor contributing to the mortality of HCC patients. Therefore, accurately predicting the risk of ER after curative resection is crucial for clinical decision-making and improving patient prognosis. This study leverages a self-supervised multi-modal feature fusion approach, combining multi-phase MRI and clinical features, to predict ER of HCC. Specifically, we utilized attention mechanisms to suppress redundant features, enabling efficient extraction and fusion of multi-phase features. Through self-supervised learning (SSL), we pretrained an encoder on our dataset to extract more generalizable feature representations. Finally, we achieved effective multi-modal information fusion via attention modules. To enhance explainability, we employed Score-CAM to visualize the key regions influencing the model's predictions. We evaluated the effectiveness of the proposed method on our dataset and found that predictions based on multi-phase feature fusion outperformed those based on single-phase features. Additionally, predictions based on multi-modal feature fusion were superior to those based on single-modal features.
Active learning based on multi-enhanced views for classification of multiple patterns in lung ultrasound images
Ni Y, Cong Y, Zhao C, Yu J, Wang Y, Zhou G and Shen M
There are several main patterns in lung ultrasound (LUS) images, including A-lines, B-lines, consolidation and pleural effusion. LUS images of healthy lungs typically only exhibit A-lines, while other patterns may emerge or coexist in LUS images associated with different lung diseases. The accurate categorization of these primary patterns is pivotal for effective lung disease screening. However, two challenges complicate the classification task: the first is the inherent blurring of feature differences between main patterns due to ultrasound imaging properties; and the second is the potential coexistence of multiple patterns in a single case, with only the most dominant pattern being clinically annotated. To address these challenges, we propose the active learning based on multi-enhanced views (MEVAL) method to achieve more precise pattern classification in LUS. To accentuate feature differences between multiple patterns, we introduce a feature enhancement module that applies vertical linear fitting and k-means clustering. The multi-enhanced views are then employed in parallel with the original images, thus enhancing MEVAL's awareness of feature differences between multiple patterns. To tackle the pattern coexistence issue, we propose an active learning strategy based on confidence sets and misclassified sets. This strategy enables the network to simultaneously recognize multiple patterns by selectively labeling a small number of images. Our dataset comprises 5075 LUS images, with approximately 4% exhibiting multiple patterns. Experimental results showcase the effectiveness of the proposed method in the classification task, with an accuracy of 98.72%, AUC of 0.9989, sensitivity of 98.76%, and specificity of 98.16%, outperforming state-of-the-art deep learning-based methods. A series of comprehensive ablation studies suggests the effectiveness of each proposed component and shows great potential for clinical application.
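The accuracy, sensitivity, and specificity figures quoted above follow from confusion-matrix counts. The sketch below shows the standard definitions for a single pattern treated one-vs-rest; it is illustrative only, and the function name, the one-vs-rest framing, and the toy labels are assumptions, since the abstract does not say how the multi-pattern metrics are aggregated.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = pattern present)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: fraction of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy example: 1 marks images annotated with a given pattern (e.g. B-lines).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

For several patterns, one option is to compute these values per pattern and average them.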
VLFATRollout: Fully transformer-based classifier for retinal OCT volumes
Oghbaie M, Araújo T, Schmidt-Erfurth U and Bogunović H
Despite the promising capabilities of 3D transformer architectures in video analysis, their application to high-resolution 3D medical volumes encounters several challenges. One major limitation is the high number of 3D patches, which reduces the efficiency of the global self-attention mechanisms of transformers. Additionally, background information can distract vision transformers from focusing on crucial areas of the input image, thereby introducing noise into the final representation. Moreover, the variability in the number of slices per volume complicates the development of models capable of processing input volumes of any resolution, while simple solutions like subsampling risk losing essential diagnostic details.
RPDNet: A reconstruction-regularized parallel decoders network for rectal tumor and rectum co-segmentation
Huang W, Xu Y, Wang Y, Zheng H and Guo Y
Accurate segmentation of the rectal tumor and rectum in magnetic resonance imaging (MRI) is important for precise tumor diagnosis and treatment planning. Variable shapes and unclear boundaries of rectal tumors make this task particularly challenging. Only a few studies have explored deep learning networks for rectal tumor segmentation, and these mainly adopt the classical encoder-decoder structure. The frequent downsampling operations during feature extraction result in the loss of detailed information, limiting the network's ability to precisely capture the shape and boundary of rectal tumors. This paper proposes a Reconstruction-regularized Parallel Decoder network (RPDNet) to address the problem of information loss and obtain accurate co-segmentation results of both rectal tumor and rectum. RPDNet first establishes a shared encoder and parallel decoders framework to fully utilize the common knowledge between the two segmentation labels while reducing the number of network parameters. An auxiliary reconstruction branch is then introduced, which calculates the consistency loss between the reconstructed and input images to preserve sufficient anatomical structure information. Moreover, a non-parameter target-adaptive attention module is proposed to distinguish the unclear boundary by enhancing the feature-level contrast between rectal tumors and normal tissues. The experimental results indicate that the proposed method outperforms state-of-the-art approaches in rectal tumor and rectum segmentation tasks, with Dice coefficients of 84.91 % and 90.36 %, respectively, demonstrating its potential application value in clinical practice.
A review of AutoML optimization techniques for medical image applications
Ali MJ, Essaid M, Moalic L and Idoumghar L
Automatic analysis of medical images using machine learning techniques has gained significant importance over the years. A large number of approaches have been proposed for solving different medical image analysis tasks using machine learning and deep learning. These approaches are quite effective thanks to their ability to analyze large volumes of medical imaging data. Moreover, they can also identify patterns that may be difficult for human experts to detect. However, manually designing and tuning the parameters of these algorithms is a challenging and time-consuming task. Furthermore, designing a generalized model that can handle different imaging modalities is difficult, as each modality has specific characteristics. To solve these problems and automate the whole pipeline of different medical image analysis tasks, numerous Automatic Machine Learning (AutoML) techniques have been proposed. These techniques include Hyper-parameter Optimization (HPO), Neural Architecture Search (NAS), and Automatic Data Augmentation (ADA). This study provides an overview of several AutoML-based approaches for different medical imaging tasks in terms of optimization search strategies. The usage of optimization techniques (evolutionary, gradient-based, Bayesian optimization, etc.) is of significant importance for these AutoML approaches. We comprehensively reviewed existing AutoML approaches, categorized them, and performed a detailed analysis of the different proposed approaches. Furthermore, current challenges and possible future research directions are also discussed.
Robust brain MRI image classification with SIBOW-SVM
Zeng L and Zhang HH
Primary Central Nervous System tumors in the brain are among the most aggressive diseases affecting humans. Early detection and classification of brain tumor types, whether benign or malignant, glial or non-glial, is critical for cancer prevention and treatment, ultimately improving human life expectancy. Magnetic Resonance Imaging (MRI) is the most effective technique for brain tumor detection, generating comprehensive brain scans. However, human examination can be error-prone and inefficient due to the complexity, size, and location variability of brain tumors. Recently, automated classification techniques using machine learning methods, such as Convolutional Neural Networks (CNNs), have demonstrated significantly higher accuracy than manual screening. However, deep learning-based image classification methods, including CNNs, face challenges in estimating class probabilities without proper model calibration (Guo et al., 2017; Minderer et al., 2021). In this paper, we propose a novel brain tumor image classification method called SIBOW-SVM, which integrates the Bag-of-Features model with SIFT feature extraction and weighted Support Vector Machines. This new approach can effectively extract hidden image features, enabling differentiation of various tumor types, provide accurate label predictions, and estimate probabilities of images belonging to each class, offering high-confidence classification decisions. We have also developed scalable and parallelizable algorithms to facilitate the practical implementation of SIBOW-SVM for massive image datasets. To benchmark our method, we apply SIBOW-SVM to a public dataset of brain tumor MRI images containing four classes: glioma, meningioma, pituitary, and normal. Our results demonstrate that the new method outperforms state-of-the-art techniques, including CNNs, in terms of uncertainty quantification, classification accuracy, computational efficiency, and data robustness.
Prior knowledge-guided vision-transformer-based unsupervised domain adaptation for intubation prediction in lung disease at one week
Yang J, Henao JAG, Dvornek N, He J, Bower DV, Depotter A, Bajercius H, de Mortanges AP, You C, Gange C, Ledda RE, Silva M, Dela Cruz CS, Hautz W, Bonel HM, Reyes M, Staib LH, Poellinger A and Duncan JS
Data-driven approaches have achieved great success in various medical image analysis tasks. However, fully-supervised data-driven approaches require unprecedentedly large amounts of labeled data and often suffer from poor generalization to unseen new data due to domain shifts. Various unsupervised domain adaptation (UDA) methods have been actively explored to solve these problems. Anatomical and spatial priors in medical imaging are common and have been incorporated into data-driven approaches to ease the need for labeled data as well as to achieve better generalization and interpretation. Inspired by the effectiveness of recent transformer-based methods in medical image analysis, the adaptability of transformer-based models has been investigated. How to incorporate prior knowledge for transformer-based UDA models remains under-explored. In this paper, we introduce a prior knowledge-guided and transformer-based unsupervised domain adaptation (PUDA) pipeline. It regularizes the vision transformer attention heads using anatomical and spatial prior information that is shared by both the source and target domain, which provides additional insight into the similarity between the underlying data distribution across domains. Besides the global alignment of class tokens, it assigns local weights to guide the token distribution alignment via adversarial training. We evaluate our proposed method on a clinical outcome prediction task, where Computed Tomography (CT) and Chest X-ray (CXR) data are collected and used to predict the intubation status of patients in a week. Abnormal lesions are regarded as anatomical and spatial prior information for this task and are annotated in the source domain scans. Extensive experiments show the effectiveness of the proposed PUDA method.
Computational modeling of tumor invasion from limited and diverse data in Glioblastoma
Jonnalagedda P, Weinberg B, Min TL, Bhanu S and Bhanu B
For diseases with high morbidity rates such as Glioblastoma Multiforme, the prognostic and treatment planning pipeline requires a comprehensive analysis of imaging, clinical, and molecular data. Many mutations have been shown to correlate strongly with the median survival rate and response to therapy of patients. Studies have demonstrated that these mutations manifest as specific visual biomarkers in tumor imaging modalities such as MRI. To minimize the number of invasive procedures on a patient and for the overall resource optimization for the prognostic and treatment planning process, the correlation of imaging and molecular features has garnered much interest. While the tumor mass is the most significant feature, the impacted tissue surrounding the tumor is also a significant biomarker contributing to the visual manifestation of mutations - which has not been studied as extensively. The pattern of tumor growth impacts the surrounding tissue accordingly, which is a reflection of tumor properties as well. Modeling how the tumor growth impacts the surrounding tissue can reveal important information about the patterns of tumor enhancement, which in turn has significant diagnostic and prognostic value. This paper presents the first work to automate the computational modeling of the impacted tissue surrounding the tumor using generative deep learning. The paper isolates and quantifies the impact of the Tumor Invasion (TI) on surrounding tissue based on change in mutation status, subsequently assessing its prognostic value. Furthermore, a TI Generative Adversarial Network (TI-GAN) is proposed to model the tumor invasion properties. Extensive qualitative and quantitative analyses, cross-dataset testing, and radiologist blind tests are carried out to demonstrate that TI-GAN can realistically model the tumor invasion under practical challenges of medical datasets such as limited data and high intra-class heterogeneity.
WISE: Efficient WSI selection for active learning in histopathology
Kang H, Kim M, Ko YS, Cho Y and Yi MY
Deep neural network (DNN) models have been applied to a wide variety of medical image analysis tasks, often with performance that matches that of medical doctors. However, given that even minor errors in a model can impact patients' lives, it is critical that these models are continuously improved. Hence, active learning (AL) has garnered attention as an effective and sustainable strategy for enhancing DNN models for the medical domain. Extant AL research in histopathology has primarily focused on patch datasets derived from whole-slide images (WSIs), a standard form of cancer diagnostic images obtained from a high-resolution scanner. However, this approach has failed to address the selection of WSIs, which can impede the performance improvement of deep learning models and increase the number of WSIs needed to achieve the target performance. This study introduces a WSI-level AL method, termed WSI-informative selection (WISE). WISE is designed to select informative WSIs using a newly formulated WSI-level class distance metric. This method aims to identify diverse and uncertain cases of WSIs, thereby contributing to model performance enhancement. WISE demonstrates state-of-the-art performance across the Colon and Stomach datasets, collected in the real world, as well as the public DigestPath dataset, reducing the required number of WSIs by more than threefold compared to the one-pool dataset setting, which has been dominantly used in the field.
Detecting thyroid nodules along with surrounding tissues and tracking nodules using motion prior in ultrasound videos
Gao S, Li Y and Luo H
Ultrasound examination plays a crucial role in the clinical diagnosis of thyroid nodules. Although deep learning technology has been applied to thyroid nodule examinations, the existing methods all overlook the prior knowledge of nodules moving along a straight line in the video. We propose a new detection model, DiffusionVID-Line, and design a novel tracking algorithm, ByteTrack-Line, both of which fully leverage the prior knowledge of linear motion of nodules in thyroid ultrasound videos. Among them, ByteTrack-Line groups detected nodules, further reducing the workload of doctors and significantly improving their diagnostic speed and accuracy. In DiffusionVID-Line, we propose two new modules: Freq-FPN and Attn-Line. Freq-FPN module is used to extract frequency features, taking advantage of these features to reduce the impact of image blur in ultrasound videos. Based on the standard practice of segmented scanning by doctors, Attn-Line module enhances the attention on targets moving along a straight line, thus improving the accuracy of detection. In ByteTrack-Line, considering the characteristic of linear motion of nodules, we propose the Match-Line association module, which reduces the number of nodule ID switches. In the testing of the detection and tracking datasets, DiffusionVID-Line achieved a mean Average Precision (mAP50) of 74.2 for multiple tissues and 85.6 for nodules, while ByteTrack-Line achieved a Multiple Object Tracking Accuracy (MOTA) of 83.4. Both nodule detection and tracking have achieved state-of-the-art performance.
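As background for the tracking result above, Multiple Object Tracking Accuracy is conventionally defined as one minus the ratio of all tracking errors (misses, false positives, and identity switches) to the total number of ground-truth objects, summed over frames. The sketch below applies that standard definition; the function name, the tuple layout, and the per-frame counts are hypothetical and are not taken from the paper.

```python
def mota(frames):
    """MOTA from per-frame (misses, false_positives, id_switches, num_ground_truth)."""
    total_errors = 0
    total_gt = 0
    for misses, false_positives, id_switches, num_gt in frames:
        total_errors += misses + false_positives + id_switches
        total_gt += num_gt
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    # The value can be negative when errors outnumber ground-truth objects.
    return 1.0 - total_errors / total_gt


# Hypothetical two-frame video with 10 annotated nodules per frame.
frames = [
    (1, 0, 0, 10),  # one missed nodule
    (0, 2, 1, 10),  # two false positives and one ID switch
]
score = mota(frames)  # 1 - 4 / 20
```

The function returns a fraction; a figure such as 83.4 corresponds to the same quantity expressed as a percentage. Reducing ID switches, as the Match-Line module is described as doing, directly lowers the error term.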
RibFractureSys: A gem in the face of acute rib fracture diagnoses
Castro-Zunti R, Li K, Vardhan A, Choi Y, Jin GY and Ko SB
Rib fracture patients, common in trauma wards, have different mortality rates and comorbidities depending on how many and which ribs are fractured. This knowledge is therefore paramount to make accurate prognoses and prioritize patient care. However, tracking 24 ribs over upwards of 200 frames in a patient's scan is time-consuming and error-prone for radiologists, to a degree that depends on their experience. We propose an automated, modular, three-stage solution to assist radiologists. Using 9 fully annotated patient scans, we trained a multi-class U-Net to segment rib lesions and common anatomical clutter. To recognize rib fractures and mitigate false positives, we fine-tuned a ResNet-based model using 5698 false positives, 2037 acute fractures, 4786 healed fractures, and 14,904 unfractured rib lesions. Using almost 200 patient cases, we developed a highly task-customized multi-object rib lesion tracker to determine which lesions in a frame belong to which of the 12 ribs on either side; bounding box intersection over union- and centroid-based tracking, a line-crossing methodology, and various heuristics were utilized. Our system accepts an axial CT scan and processes, labels, and color-codes the scan. Over an internal validation dataset of 1000 acute rib fracture and 1000 control patients, our system, assessed by a third-year radiology resident, achieved 96.1% and 97.3% correct fracture classification accuracy for rib fracture and control patients, respectively. However, 18.0% and 20.8% of these patients, respectively, had incorrect rib labeling. Percentages remained consistent across sex and age demographics. Labeling issues include anatomical clutter being mislabeled as ribs and ribs going unlabeled.
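The tracker above is described as combining bounding box intersection over union (IoU) with centroid-based tracking. The sketch below illustrates those two building blocks and one simple way they might be combined when carrying a rib label from one frame to the next. It is not the authors' tracker: the function names, the `(x1, y1, x2, y2)` box layout, the `min_iou` threshold, the centroid fallback rule, and the rib identifiers are all assumptions.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def centroid_distance(a, b):
    """Euclidean distance between the centers of two boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ax - bx, ay - by)


def match_detection(previous, detection, min_iou=0.3):
    """Pick the rib label from the previous frame that best explains a detection.

    Assumed rule: prefer the highest IoU if it reaches min_iou, otherwise fall
    back to the nearest centroid.
    """
    best_id = max(previous, key=lambda rid: box_iou(previous[rid], detection))
    if box_iou(previous[best_id], detection) >= min_iou:
        return best_id
    return min(previous, key=lambda rid: centroid_distance(previous[rid], detection))


# Hypothetical boxes for two ribs in the previous frame.
previous = {"L4": (0, 0, 10, 10), "L5": (0, 20, 10, 30)}
overlapping = match_detection(previous, (1, 1, 11, 11))   # large overlap with L4
displaced = match_detection(previous, (20, 22, 24, 26))   # no overlap, nearest is L5
```

A real tracker would also need to handle new and disappearing lesions and conflicts between detections, which is where the line-crossing methodology and heuristics mentioned in the abstract would come in.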
MultiNet 2.0: A lightweight attention-based deep learning network for stenosis measurement in carotid ultrasound scans and cardiovascular risk assessment
Biswas M, Saba L, Kalra M, Singh R, Fernandes E Fernandes J, Viswanathan V, Laird JR, Mantella LE, Johri AM, Fouda MM and Suri JS
Cardiovascular diseases (CVD) cause 19 million fatalities each year and cost nations billions of dollars. Surrogate biomarkers are established methods for CVD risk stratification; however, manual inspection is costly, cumbersome, and error-prone. Contemporary artificial intelligence (AI) tools for segmentation and risk prediction, including older deep learning (DL) networks, employ simple merge connections, which may result in semantic loss of information and hence low accuracy.
Corrigendum to 'Development and evaluation of an integrated model based on a deep segmentation network and demography-added radiomics algorithm for segmentation and diagnosis of early lung adenocarcinoma' [Computerized Medical Imaging and Graphics Volume 109 (2023) 102299]
Lee J, Chun J, Kim H, Kim JS and Park SY
Machine learning-based diagnostics of capsular invasion in thyroid nodules with wide-field second harmonic generation microscopy
Padrez Y, Golubewa L, Timoshchenko I, Enache A, Eftimie LG, Hristu R and Rutkauskas D
Papillary thyroid carcinoma (PTC) is one of the most common, well-differentiated carcinomas of the thyroid gland. PTC nodules are often surrounded by a collagen capsule that prevents the spread of cancer cells. However, as the malignant tumor progresses, the integrity of this protective barrier is compromised, and cancer cells invade the surroundings. The detection of capsular invasion is, therefore, crucial for the diagnosis and the choice of treatment, and the development of new approaches aimed at increasing diagnostic performance is of great importance. In the present study, we exploited wide-field second harmonic generation (SHG) microscopy in combination with texture analysis and unsupervised machine learning (ML) to explore the possibility of quantitative characterization of collagen structure in the capsule and designation of different capsule areas as either intact, disrupted by invasion, or prone to invasion. Two-step k-means clustering showed that the collagen capsules in all analyzed tissue sections were highly heterogeneous and exhibited distinct segments described by characteristic ML parameter sets. The latter allowed a structural interpretation of the collagen fibers at the sites of overt invasion as fragmented and curled fibers with rarely formed distributed networks. Clustering analysis also distinguished areas in the PTC capsule that were not categorized as invasion sites by the initial histopathological analysis but could be recognized as prospective micro-invasions after additional inspection. The characteristic features of suspicious and invasive sites identified by the proposed unsupervised ML approach can become a reliable complement to existing methods for diagnosing encapsulated PTC, increase the reliability of diagnosis, simplify decision making, and prevent human-related diagnostic errors. In addition, the proposed automated ML-based selection of collagen capsule images and exclusion of non-informative regions can greatly accelerate and simplify the development of reliable methods for fully automated ML diagnosis that can be integrated into clinical practice.
Distance guided generative adversarial network for explainable medical image classifications
Xiong X, Sun Y, Liu X, Ke W, Lam CT, Chen J, Jiang M, Wang M, Xie H, Tong T, Gao Q, Chen H and Tan T
Despite the potential benefits of data augmentation for mitigating data insufficiency, traditional augmentation methods primarily rely on prior intra-domain knowledge. On the other hand, advanced generative adversarial networks (GANs) generate inter-domain samples with limited variety. These previous methods make limited contributions to describing the decision boundaries for binary classification. In this paper, we propose a distance-guided GAN (DisGAN) that controls the variation degrees of generated samples in the hyperplane space. Specifically, we instantiate DisGAN by combining two components. The first is the vertical distance GAN (VerDisGAN), where the inter-domain generation is conditioned on the vertical distances. The second is the horizontal distance GAN (HorDisGAN), where the intra-domain generation is conditioned on the horizontal distances. Furthermore, VerDisGAN can produce the class-specific regions by mapping the source images to the hyperplane. Experimental results show that DisGAN consistently outperforms the GAN-based augmentation methods with explainable binary classification. The proposed method can be applied to different classification architectures and has the potential to extend to multi-class classification. We provide the code at https://github.com/yXiangXiong/DisGAN.
An anthropomorphic diagnosis system of pulmonary nodules using weak annotation-based deep learning
Xie L, Xu Y, Zheng M, Chen Y, Sun M, Archer MA, Mao W, Tong Y and Wan Y
The accurate categorization of lung nodules in CT scans is an essential aspect in the prompt detection and diagnosis of lung cancer. The categorization of grade and texture for nodules is particularly significant since it can aid radiologists and clinicians to make better-informed decisions concerning the management of nodules. However, currently existing nodule classification techniques have a singular function of nodule classification and rely on an extensive amount of high-quality annotation data, which does not meet the requirements of clinical practice. To address this issue, we develop an anthropomorphic diagnosis system of pulmonary nodules (PN) based on deep learning (DL) that is trained by weak annotation data and has comparable performance to full-annotation based diagnosis systems. The proposed system uses DL models to classify PNs (benign vs. malignant) with weak annotations, which eliminates the need for time-consuming and labor-intensive manual annotations of PNs. Moreover, the PN classification networks, augmented with handcrafted shape features acquired through the ball-scale transform technique, demonstrate capability to differentiate PNs with diverse labels, including pure ground-glass opacities, part-solid nodules, and solid nodules. Through 5-fold cross-validation on two datasets, the system achieved the following results: (1) an Area Under Curve (AUC) of 0.938 for PN localization and an AUC of 0.912 for PN differential diagnosis on the LIDC-IDRI dataset of 814 testing cases, (2) an AUC of 0.943 for PN localization and an AUC of 0.815 for PN differential diagnosis on the in-house dataset of 822 testing cases. In summary, our system demonstrates efficient localization and differential diagnosis of PNs in a resource limited environment, and thus could be translated into clinical use in the future.
MRI-based vector radiomics for predicting breast cancer HER2 status and its changes after neoadjuvant therapy
Zhang L, Cui QX, Zhou LQ, Wang XY, Zhang HX, Zhu YM, Sang XQ and Kuai ZX
To develop a novel MRI-based vector radiomic approach to predict breast cancer (BC) human epidermal growth factor receptor 2 (HER2) status (zero, low, and positive; task 1) and its changes after neoadjuvant therapy (NAT) (positive-to-positive, positive-to-negative, and positive-to-pathologic complete response; task 2).
BreasTDLUSeg: A coarse-to-fine framework for segmentation of breast terminal duct lobular units on histopathological whole-slide images
Lu Z, Tang K, Wu Y, Zhang X, An Z, Zhu X, Feng Q and Zhao Y
Automatic segmentation of breast terminal duct lobular units (TDLUs) on histopathological whole-slide images (WSIs) is crucial for the quantitative evaluation of TDLUs in the diagnostic and prognostic analysis of breast cancer. However, TDLU segmentation remains a great challenge due to its highly heterogeneous sizes, structures, and morphologies as well as the small areas on WSIs. In this study, we propose BreasTDLUSeg, an efficient coarse-to-fine two-stage framework based on multi-scale attention to achieve localization and precise segmentation of TDLUs on hematoxylin and eosin (H&E)-stained WSIs. BreasTDLUSeg consists of two networks: a superpatch-based patch-level classification network (SPPC-Net) and a patch-based pixel-level segmentation network (PPS-Net). SPPC-Net takes a superpatch as input and adopts a sub-region classification head to classify each patch within the superpatch as TDLU positive or negative. PPS-Net takes the TDLU positive patches derived from SPPC-Net as input. PPS-Net deploys a multi-scale CNN-Transformer as an encoder to learn enhanced multi-scale morphological representations and an upsampler to generate pixel-wise segmentation masks for the TDLU positive patches. We also constructed two breast cancer TDLU datasets containing a total of 530 superpatch images with patch-level annotations and 2322 patch images with pixel-level annotations to enable the development of TDLU segmentation methods. Experiments on the two datasets demonstrate that BreasTDLUSeg outperforms other state-of-the-art methods with the highest Dice similarity coefficients of 79.97% and 92.93%, respectively. The proposed method shows great potential to assist pathologists in the pathological analysis of breast cancer. An open-source implementation of our approach can be found at https://github.com/Dian-kai/BreasTDLUSeg.