Huanhuan Zhang 1,† and Yufei Qie
National Key Laboratory of Antennas and Microwave Technology, Xidian University, Xi’an 710071, China
Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2023, 13(18), 10521; https://doi.org/10.3390/app131810521
Submission received: 30 April 2023 / Revised: 20 May 2023 / Accepted: 8 September 2023 / Published: 21 September 2023
(This article belongs to the Special Issue Medical Imaging Using Machine Learning and Deep Learning)
Deep learning (DL) has made significant strides in medical imaging. This review article presents an in-depth analysis of DL applications in medical imaging, focusing on the challenges, methods, and future perspectives. We discuss the impact of DL on the diagnosis and treatment of diseases and how it has revolutionized the medical imaging field. Furthermore, we examine the most recent DL techniques, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs), and their applications in medical imaging. Lastly, we provide insights into the future of DL in medical imaging, highlighting its potential advancements and challenges.
Medical imaging has been a critical component of modern healthcare, providing clinicians with vital information for the diagnosis, treatment, and monitoring of various diseases [1]. Traditional image analysis techniques often rely on handcrafted features and expert knowledge, which can be time-consuming and subject to human error [2]. In recent years, machine learning (ML) methods have been increasingly applied to medical image analysis to improve efficiency and reduce potential human errors. These methods, including Support Vector Machines (SVMs), decision trees, random forests, and logistic regression, have shown success in tasks such as image segmentation, object detection, and disease classification. They typically involve the manual selection and extraction of features from the medical images, which are then used for prediction or classification. With the rapid development of deep learning (DL) technologies, there has been a significant shift toward leveraging these powerful tools to improve the accuracy and efficiency of medical image analysis [3]. Unlike traditional ML methods, DL models are capable of automatically learning and extracting hierarchical features from raw data. DL, a subfield of ML, has made remarkable advancements in recent years, particularly in image recognition and natural language processing tasks [4]. This success is primarily attributed to the development of artificial neural networks (ANNs) with multiple hidden layers, which make this automatic feature learning possible [5]. Consequently, DL techniques and network-based computation have been widely adopted in various applications, including autonomous driving, robotics, natural language understanding [6], and a large number of engineering computation cases [7–43].
In the medical imaging domain, DL has shown great potential for enhancing the quality of care and improving patient outcomes [44]. By automating the analysis of medical images, DL algorithms can aid in the early detection of diseases, streamline clinical workflows, and reduce the burden on healthcare professionals [45]. DL also plays a significant role in the credibility verification of reported medical data. For instance, it can be used to identify anomalies or inconsistencies in the data, thereby ensuring the reliability of the data used for diagnosis or treatment planning. DL models can also help validate the authenticity of medical images, which is crucial in today’s digital age where data manipulation has become increasingly sophisticated. Moreover, DL models can be trained to predict disease progression and treatment response, thereby contributing to personalized medicine and the optimization of therapeutic strategies [46].
In this review, we discuss the potential of DL models in medical imaging. The surveyed literature indicates that DL techniques are transforming medical imaging research and opening new avenues for diagnosis and treatment strategies. This paper details these methods, their reported results, and the implications of these findings for future research.
Several DL techniques have been applied to medical imaging [47,48,49,50,51,52], with convolutional neural networks (CNNs) being the most prevalent [53]. CNNs are particularly suited for image analysis tasks due to their ability to capture local spatial patterns and automatically learn hierarchical representations from input images [54]. Other DL techniques that have been applied to medical imaging include recurrent neural networks (RNNs), which are well-suited for handling sequential data, and generative adversarial networks (GANs), which can generate new samples from learned data distributions [55]. In assessing the performance of DL models in medical image diagnosis, several evaluation metrics are commonly employed, including Receiver Operating Characteristic (ROC) curves and confusion matrices, among other techniques [1,2,3]. The ROC curve is a graphical plot that illustrates the diagnostic ability of a DL model as its discrimination threshold is varied. It presents the trade-off between sensitivity (the True Positive Rate) and specificity (1 − False Positive Rate), providing a measure of how well a model distinguishes between classes. The Area Under the ROC Curve (AUC) is also considered, as it provides a single metric for comparing model performance. Confusion matrices, on the other hand, summarize the prediction results of a classification problem: the numbers of correct and incorrect predictions are counted and broken down by class. This offers a more granular view of model performance, including metrics such as precision, recall, and F1-score, which are crucial when dealing with imbalanced classes.
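The evaluation metrics described above can be illustrated with a minimal sketch. The labels, scores, and the 0.5 decision threshold below are hypothetical toy values rather than results from any cited study, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one), which is one of several equivalent ways to obtain it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall (sensitivity), and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 diseased cases (1) and 5 healthy cases (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
auc = roc_auc(y_true, scores)
print(tp, fp, fn, tn, round(precision, 3), round(recall, 3), round(auc, 3))
```

Note that the confusion-matrix metrics depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds, which is why the two are usually reported together.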
There are various medical imaging modalities used in clinical practice, each providing unique information and serving specific diagnostic purposes [56]. Some of the most common modalities include magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), ultrasound imaging, and optical coherence tomography (OCT) [57], as shown in Figure 1. DL techniques have been successfully applied to these modalities for tasks such as image segmentation, classification, reconstruction, and registration [46].
Despite the promising results achieved by DL in medical imaging, several challenges remain [47,48,49,50,51,52]. One major challenge is the limited availability of annotated medical image datasets due to the time-consuming and costly nature of manual annotations [58]. Additionally, data privacy concerns and the sharing of sensitive patient information pose significant obstacles to the development of large-scale, multi-institutional datasets [59]. Another challenge is the interpretability of DL models, as they often act as “black boxes” that provide limited insights into their decision-making processes [60]. Ensuring the explainability and trustworthiness of these models is crucial for their adoption in clinical practice, as clinicians need to understand the rationale behind their predictions [61].
Despite these challenges, DL in medical imaging presents numerous opportunities for advancing healthcare and improving patient outcomes. With ongoing research, interdisciplinary collaboration, and the development of more sophisticated algorithms, DL has the potential to revolutionize medical imaging and contribute significantly to the future of medicine.
Deep learning techniques in medical imaging can serve a wide array of functions, both in terms of the acquisition of medical images and the identification of pathologies within these images. Specifically, these techniques are leveraged not only to enhance the quality of images obtained through various modalities but also to enable effective and efficient identification of pathological markers within these images. For example, convolutional neural networks (CNNs) can be used in the reconstruction of images from MRI scanners, enhancing the resolution of the obtained images and thereby allowing for a clearer visualization of potential pathologies [53]. Moreover, CNNs are particularly adept at analyzing these images postacquisition, identifying key features within these images that could point toward specific pathologies [54]. This dual functionality—improving the acquisition of images and aiding in the identification of pathologies—is a key strength of deep learning techniques in the field of medical imaging. Throughout this section, we will discuss three major types of deep learning techniques used in medical imaging: convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs). For each technique, we will detail its basic concepts, architecture and applications, role in image acquisition, and pathology detection, along with transfer learning approaches and the limitations and challenges faced.
Convolutional neural networks (CNNs) are a class of DL models designed specifically for image analysis tasks [4]. Their basic mechanism is illustrated in Figure 2. CNNs consist of multiple layers, including convolutional, pooling, and fully connected layers, which work together to learn hierarchical representations of input images [62]. Convolutional layers are responsible for extracting local features from images, such as edges, corners, and textures, while pooling layers reduce the spatial dimensions of feature maps, improving computational efficiency and reducing overfitting [63]. Finally, fully connected layers integrate local features into global patterns, enabling the network to perform image classification or other desired tasks [6].
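To make the roles of the convolutional and pooling layers concrete, the following is a minimal sketch in plain Python. The 5×5 toy image and the hand-picked 2×2 edge kernel are illustrative assumptions; in a real CNN the kernel weights are learned during training and the operations are carried out by a DL framework on multichannel tensors.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation (the operation DL frameworks call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; halves each spatial dimension."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Toy image: dark left columns (0) and bright right columns (1),
# i.e., a single vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright transitions from
# left to right (a learned kernel would play this role in practice).
kernel = [[-1, 1],
          [-1, 1]]

features = relu(conv2d(image, kernel))  # 4x4 feature map, peaks at the edge
pooled = max_pool2x2(features)          # 2x2 map, edge response preserved
print(features[0], pooled)
```

The feature map is nonzero only where the kernel overlaps the edge, and pooling keeps that response while shrinking the map, which mirrors the local feature extraction and dimensionality reduction described above.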
Several CNN architectures have been proposed and widely adopted in medical imaging applications [64,65,66]. Some of the most notable architectures include LeNet [67], AlexNet [63], VGGNet [62], ResNet [53], and DenseNet [68]. These architectures have been applied to various medical imaging tasks, such as image segmentation, classification, detection, and registration [69,70].
Transfer learning is a popular approach in DL, where a pretrained model is fine-tuned for a new task or domain, leveraging the knowledge acquired during the initial training [71]. This technique is particularly useful in medical imaging [72,73,74], where annotated datasets are often limited in size [75]. By using pretrained models, researchers can take advantage of the general features learned by the model on a large dataset, such as ImageNet, and fine-tune it to perform well on a specific medical imaging task [76]. Transfer learning has been successfully applied in various medical imaging applications, including diagnosing diabetic retinopathy from retinal images, classifying skin cancer from dermoscopy images, and segmenting brain tumors from MRI scans [60,77,78,79].
RNNs are a class of DL models designed to handle sequential data [80]. Unlike feedforward neural networks, RNNs possess internal memory that enables them to maintain a hidden state across time steps, allowing them to learn patterns within sequences [81]. This property makes RNNs suitable for tasks that require processing time-dependent data, such as natural language processing, time-series prediction, and video analysis [82].
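The hidden-state mechanism described above can be sketched with a scalar Elman-style recurrence, h_t = tanh(w_x·x_t + w_h·h_{t−1} + b). The weight values and input sequences below are arbitrary illustrative choices; practical RNNs use weight matrices learned from data and vector-valued states.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Scalar Elman RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)  # new state mixes input and memory
        states.append(h)
    return states


# Two sequences that differ only in their first element.
states_a = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
states_b = rnn_forward([0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(states_a, states_b)
```

Although both sequences end with the same two inputs, the final hidden states differ, because the state carries information about the first time step forward. This is the "internal memory" that distinguishes RNNs from feedforward networks.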
While RNNs are less commonly used in medical imaging compared to CNNs, they have shown potential in specific applications that involve sequential data. Some well-known RNN architectures include the basic RNN (shown in Figure 3), long short-term memory (LSTM) [80], and gated recurrent unit (GRU) [81]. These architectures have been employed in various medical imaging tasks [83,84,85], such as image captioning, video analysis, and multimodal data fusion [86]. For instance, RNNs have been used in conjunction with CNNs for medical image captioning, where the goal is to generate a descriptive text for a given image [87]. In this context, a CNN is used to extract features from the image, while an RNN is employed to generate a sequence of words based on the extracted features [88]. This approach has been applied to generate radiology reports for chest X-rays and MRI scans [89]. Additionally, RNNs have been utilized for analyzing medical videos, such as endoscopy and laparoscopy videos [90]. In these applications, RNNs can be used to track and analyze temporal changes in the videos, enabling tasks such as surgical tool tracking, tissue segmentation, and surgical workflow analysis [55,91,92,93,94].
Generative adversarial networks (GANs) are a class of DL models designed for generating realistic samples from complex data distributions [95]. GANs consist of two neural networks, a generator and a discriminator, which are trained simultaneously in a minimax game [96]. The generator learns to produce samples that resemble the training data, while the discriminator learns to differentiate between real and generated samples. The training process continues until the generator produces samples that the discriminator cannot reliably distinguish from real data [97].
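The minimax game mentioned above is commonly written as the following value function, given here in its standard form from the original GAN formulation. The symbols are not defined elsewhere in this review and are introduced only for illustration: G is the generator, D is the discriminator, p_data is the distribution of real images, and p_z is the prior over the latent noise vector z.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes V by producing samples G(z) that the discriminator scores as real. Variants such as WGAN replace this objective with a different distance measure.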
Several GAN architectures and variants have been proposed for various tasks, including deep convolutional GAN (DCGAN) [98], Wasserstein GAN (WGAN) [99], and CycleGAN [100]. The basic mechanism of GANs is shown in Figure 4. GANs have shown promising results in medical imaging applications [101,102,103], such as image synthesis, data augmentation, and image-to-image translation [104]. For example, GANs have been used to synthesize realistic medical images, which can be valuable for training other DL models, especially when annotated data are scarce [105]. In this context, GANs have been employed to generate synthetic CT, MRI, and ultrasound images, among others [106]. Another application of GANs in medical imaging is data augmentation, where GANs are used to create additional training samples to improve model performance and generalization [62]. By generating diverse and realistic variations of the available data, GANs can help mitigate issues related to limited datasets in medical imaging contexts [107]. Image-to-image translation is a further application in which GANs have shown potential. In this task, GANs are employed to transform images from one modality or representation to another, such as converting MRI images to CT images or enhancing image quality [108]. For instance, CycleGAN has been used for cross-modality synthesis between MRI and CT images, enabling the generation of synthetic CT images from MRI scans, which can be helpful in situations where CT scans are unavailable or contraindicated [63].
Despite the successes and potential of deep learning techniques in medical imaging, several common limitations and challenges need to be addressed. These challenges span convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs) and are summarized in Table 1. One primary challenge is the lack of interpretability in deep learning models. CNNs, RNNs, and GANs often act as “black boxes,” making it difficult to understand the underlying decision-making processes. This lack of interpretability hinders their adoption in clinical practice, where explainability is crucial. Another challenge lies in the robustness and security of deep learning models. CNNs are susceptible to adversarial examples, which are carefully crafted inputs designed to deceive the model into making incorrect predictions. Adversarial attacks raise concerns about the reliability and trustworthiness of deep learning models in medical imaging applications. Furthermore, deep learning techniques, including CNNs, RNNs, and GANs, require large amounts of annotated data for training. Acquiring labeled medical imaging datasets can be time-consuming and expensive, and the resulting datasets are often limited in size. Overcoming data scarcity and finding efficient ways to leverage unlabeled data, such as unsupervised or semisupervised learning, is essential for the broader adoption of deep learning in medical imaging. Additionally, RNNs and GANs each face specific challenges. RNNs suffer from the vanishing and exploding gradient problem when trained over many time steps, making it difficult to learn long-term dependencies in sequences. The computational complexity of RNNs is also a concern, especially when dealing with long sequences or large-scale datasets. For GANs, mode collapse is a significant challenge, as it can lead to limited variety and suboptimal results in tasks such as data augmentation and image synthesis. Training GANs can also be difficult due to unstable dynamics and convergence issues. Ensuring the quality and reliability of generated images is crucial for their safe and effective use in medical imaging applications. Addressing these limitations and challenges will enhance the interpretability, robustness, scalability, and applicability of deep learning techniques in medical imaging.
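The vanishing and exploding gradient problem noted above for RNNs can be illustrated numerically. The sketch below uses deliberately simplified scalar recurrences with hypothetical weights: the gradient of the final state with respect to the initial state is a product of one factor per time step, so it shrinks or grows geometrically with sequence length.

```python
import math


def tanh_rnn_gradient(w_h, steps, h0=0.5):
    """d h_T / d h_0 for h_t = tanh(w_h * h_{t-1}), via the chain rule."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w_h * h)
        grad *= w_h * (1.0 - h * h)  # d tanh(u)/du = 1 - tanh(u)^2
    return grad


def linear_rnn_gradient(w_h, steps):
    """d h_T / d h_0 for the linear recurrence h_t = w_h * h_{t-1}."""
    return w_h ** steps


# Hypothetical recurrent weights chosen to show both regimes.
vanishing = tanh_rnn_gradient(w_h=0.5, steps=50)    # each factor is at most 0.5
exploding = linear_rnn_gradient(w_h=1.5, steps=50)  # each factor is 1.5
print(vanishing, exploding)
```

With 50 time steps the first gradient is numerically negligible and the second is very large, which is why gated architectures such as LSTM and GRU, together with techniques such as gradient clipping, are commonly used when long-term dependencies matter.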
To offer a comprehensive perspective on the role of DL in medical imaging, it is crucial to consider its multifaceted applications beyond just image reconstruction and registration. As highlighted in the title, the intention of this review is not merely to focus on these two aspects but to present a wider perspective on how DL is revolutionizing the field of medical imaging.
In the following sections, we delve into the specifics of how DL techniques have been employed in diverse tasks such as image segmentation and classification (Section 3.1 and Section 3.2), in addition to reconstruction and registration (Section 3.3 and Section 3.4). Image segmentation, for instance, involves partitioning a digital image into multiple segments to simplify the image and/or to extract relevant information. DL has significantly improved the performance of these tasks, making it a vital component of modern medical imaging. Similarly, image classification, which is the task of assigning an input image to one label from a fixed set of categories, is another area where DL has shown great potential. These varied applications underscore the breadth of DL’s impact on medical imaging, and it is this breadth that we seek to convey through this review.
Image segmentation is a critical task in medical imaging, which involves partitioning an image into multiple regions or segments, each representing a specific anatomical structure or region of interest (ROI) [45]. DL has shown exceptional performance in this domain, with CNNs being the most commonly used approach. The U-Net [6] is a popular CNN architecture specifically designed for biomedical image segmentation, which has been applied to various medical imaging modalities such as MRI, CT, and microscopy images [45]. Additionally, multiscale architectures [109], attention mechanisms [110], and 3D CNNs [111] have been proposed to improve segmentation accuracy and efficiency in complex medical imaging tasks. Figure 5 shows the application of DL approaches to liver segmentation.
Despite the success of DL-based segmentation methods, several challenges remain. These include the need for large, annotated datasets, the limited interpretability of the models, and the robustness of the algorithms to variations in image quality, acquisition protocols, and patient populations [45,112]. Future research directions may focus on developing more efficient annotation techniques, incorporating domain knowledge into DL models, and improving the generalization capabilities of these models to unseen data or rare pathologies [113].
Image classification in medical imaging involves assigning a label to an input image, typically indicating the presence or absence of a specific condition or abnormality [114]. DL techniques, particularly CNNs, have demonstrated exceptional performance in image classification tasks [63]. Transfer learning, where models pretrained on large-scale natural image datasets (e.g., ImageNet) are fine-tuned on smaller medical imaging datasets, has been widely adopted to overcome the limitations of scarce labeled data in medical imaging [76]. Additionally, DL techniques such as DenseNets [68], ResNets [53], and multitask learning approaches [115] have been used to improve classification performance in various medical imaging applications, including the detection of cancerous lesions, identification of diseases, and assessment of treatment response. Figure 6 illustrates the application of a DL approach to mammography images.
Key challenges in DL-based image classification include the limited availability of labeled data, class imbalance, and the need for model interpretability. Future research may focus on leveraging unsupervised or semisupervised learning techniques [116], data augmentation strategies [117], and advanced regularization techniques [118] to overcome these challenges. Moreover, developing methods to provide meaningful explanations for model predictions and incorporating domain knowledge into DL models may enhance their clinical utility [119].
Image reconstruction is a fundamental step in many medical imaging modalities, such as CT, MRI, and PET, where raw data (e.g., projections, k-space data) are transformed into interpretable images [120]. DL has shown potential in improving image reconstruction quality and reducing reconstruction time [121]. CNNs have been used for image denoising, super-resolution, and artifact reduction in various imaging modalities [122,123]. Additionally, DL-based iterative reconstruction techniques [124] and the integration of DL models with conventional reconstruction algorithms [125] have been proposed to optimize image quality while reducing radiation dose or acquisition time. Figure 7 presents an example of GAN-based PET image reconstruction.
Challenges in DL-based image reconstruction include the need for large-scale training data, the limited generalizability of the models across different imaging devices and acquisition protocols, and the potential for introducing new artifacts or biases into the reconstructed images [126]. Future research may focus on developing techniques to leverage limited training data, such as unsupervised or self-supervised learning methods [127], and designing more robust models that can generalize across different imaging conditions [128]. Furthermore, ensuring the safety and reliability of DL-based reconstruction methods by quantifying their uncertainties and validating their performance on large, diverse datasets will be crucial for their clinical adoption.
Image registration is the process of aligning two or more images, often acquired from different modalities or at different time points, to facilitate comparison and analysis [129]. DL has been increasingly applied to image registration tasks, with CNNs and spatial transformer networks (STNs) being the most commonly used architectures [130]. Supervised learning approaches, such as using ground-truth deformation fields or similarity metrics as labels, have been employed to train deep registration models [131]. Moreover, unsupervised learning techniques, which do not require ground-truth correspondences, have been proposed to overcome the challenges of obtaining labeled data for registration tasks [132]. Figure 8 is a typical example of DL-based medical image registration.
DL-based image registration faces challenges such as the need for large, diverse training datasets, the limited interpretability of the learned transformations, and the potential for overfitting or generating implausible deformations [70]. Future research may focus on developing more efficient and flexible DL architectures for registration, incorporating domain knowledge into the models, and designing robust evaluation metrics that can capture the clinical relevance of the registration results [133]. Additionally, leveraging multitask learning [134] and transfer learning approaches [135] may help improve the generalization and performance of deep registration models in various medical imaging applications.
Medical imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), ultrasound, and optical coherence tomography (OCT), have unique characteristics and generate different types of images. Therefore, DL techniques need to be tailored to each modality to achieve optimal performance. In this section, we will discuss the current state-of-the-art DL techniques and applications for each modality, as well as the challenges and future directions.
Before diving into the application of DL in specific imaging modalities, it is important to clarify the focus of this section. The intention is to discuss how DL is applied in the analysis of images generated by these different modalities, such as MRI, CT, PET, ultrasound imaging, and OCT, rather than its application in the process of image acquisition. Specifically, the discussion will center around how DL has been utilized to extract meaningful insights from these images, for example, through tasks such as segmentation, classification, detection, and prediction. This includes the ability to identify and classify pathologies, measure anatomical structures, and even predict treatment outcomes.
MRI is a noninvasive medical imaging modality that provides detailed structural and functional information. It has been widely used in diagnosis, treatment planning, and monitoring of various diseases. DL techniques have been applied to various tasks in MRI, including image segmentation, image registration, image synthesis, and disease classification.
CNNs have been widely used in MRI analysis tasks. For instance, U-Net [6] has been used for MRI segmentation tasks, such as brain tumor segmentation [136] and prostate segmentation [137]. Similarly, residual networks (ResNets) [53] have been used for MRI reconstruction [119] and disease classification [110]. RNNs have also been used for MRI analysis, such as brain tumor segmentation [138]. GANs have been used for MRI applications such as image synthesis and image-to-image translation, including the synthesis of brain MRI images [104] and the generation of CT images from MRI images [124], as well as for image denoising [139] and super-resolution [140] in MRI. Figure 9 shows an example of MRI brain image segmentation.
Despite the promising results, there are still challenges in applying DL techniques to MRI analysis. One of the major challenges is the limited availability of large, annotated datasets. Moreover, the heterogeneity of MRI data, such as differences in image contrast, image resolution, and imaging protocols, makes it difficult to generalize DL models to new datasets. Therefore, developing transferable models that can handle these variations is an important future direction. Additionally, incorporating domain-specific knowledge and prior information into DL models can further improve their performance.
CT is a widely used medical imaging modality that provides detailed anatomical information. It is commonly used in the diagnosis and treatment planning of various diseases, such as cancer and cardiovascular diseases. DL techniques have been applied to various tasks in CT, including image segmentation, disease detection, and diagnosis.
CNNs have been widely used in CT analysis tasks. For example, Mask R-CNN [141] has been used for lung nodule detection in CT images [142]. CNNs have also been used for CT image registration [143] and image segmentation [144]. Moreover, DL techniques have been applied to CT angiography for vessel segmentation and centerline extraction [145]. In addition to CNNs, GANs have also been used in CT applications, such as image denoising [146] and image synthesis [147]. For example, GANs have been used for synthesizing low-dose CT images from high-dose CT images [148]. Figure 10 illustrates CT image classification using DL techniques.
One of the challenges in applying DL techniques to CT analysis is the limited availability of annotated datasets. In addition, CT images can contain high levels of noise, which can affect the performance of DL models. Therefore, developing DL models that are robust to noise is an important future direction, as is developing transferable models that can handle variations in imaging protocols and patient populations.
PET is a medical imaging modality that is used for functional imaging. It is commonly used in cancer diagnosis and treatment planning. DL techniques have been applied to various tasks in PET, including image segmentation, image reconstruction, and disease classification. Figure 11 is a typical example of PET image segmentation using a DL-based method.
CNNs have been widely used in PET image analysis tasks. For example, U-Net [6] has been used for PET image segmentation [149]. Moreover, GANs have been used for PET image reconstruction [150] and image denoising [151]. Additionally, DL techniques have been applied to PET image registration [152] and disease classification [140].
One of the challenges in applying DL techniques to PET is the limited availability of annotated datasets, particularly for rare diseases. Moreover, PET images suffer from low spatial resolution and high noise levels, which can affect the performance of DL models. Therefore, developing robust DL models that can handle these challenges is an important future direction, as is developing transferable models that can handle variations in imaging protocols and patient populations.
Ultrasound is a medical imaging modality that uses high-frequency sound waves to produce images of the internal organs and tissues. It is commonly used in obstetrics, cardiology, and urology. DL techniques have been applied to various tasks in ultrasound imaging, including image segmentation, disease classification, and image denoising. Figure 12 presents one example of fetal head detection in ultrasound images using convolutional neural networks.
CNNs have been widely used in ultrasound image analysis tasks. For example, U-Net [6] has been used for segmentation of the fetal brain in ultrasound images [153]. Moreover, RNNs have been used for tracking the fetal brain in ultrasound videos [154]. Additionally, DL techniques have been applied to ultrasound elastography for tissue characterization [155].
One of the challenges in applying DL techniques to ultrasound imaging is the limited availability of annotated datasets, particularly for rare diseases. Moreover, ultrasound images are prone to artifacts and noise, which can affect the performance of DL models. Therefore, developing robust DL models that can handle these challenges is an important future direction, as is developing transferable models that can handle variations in imaging protocols and patient populations.
OCT is a medical imaging modality that uses light waves to produce images of biological tissues. It is commonly used in ophthalmology for imaging the retina and the optic nerve. DL techniques have been applied to various tasks in OCT imaging, including image segmentation, disease classification, and image registration. Figure 13 shows the workflow of OCT angiography using a DL-based approach.
CNNs have been widely used in OCT image analysis tasks. For example, a fully convolutional network (FCN) has been used for segmentation of retinal layers in OCT images [156]. Moreover, DL techniques have been applied to OCT angiography for vessel segmentation and centerline extraction [6]. Additionally, RNNs have been used for tracking the movement of retinal layers in OCT videos [44].
One of the challenges in applying DL techniques to OCT imaging is the limited availability of annotated datasets, particularly for rare diseases. Moreover, OCT images suffer from speckle noise and low signal-to-noise ratio, which can affect the performance of DL models. Two important future directions follow: developing robust DL models that tolerate this noise and these data limitations, and developing transferable models that can handle variations in imaging protocols and patient populations.
DL techniques for medical imaging have shown impressive performance in various tasks, including image segmentation, classification, reconstruction, and registration. To evaluate the performance of these methods, appropriate metrics and benchmarks are needed.
Various metrics have been proposed to evaluate the performance of DL methods for medical imaging. For image segmentation tasks, commonly used metrics include Dice coefficient, Jaccard index, and surface distance measures [157]. For image classification tasks, metrics such as accuracy, precision, recall, and F1 score are often used [158]. For image reconstruction tasks, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) are commonly used metrics [159]. In addition, some studies have proposed novel metrics specific to certain applications, such as registration accuracy and tumor size measurement in cancer imaging [160]. It is important to note that no single metric can fully capture the performance of a DL method, and a combination of metrics should be used for comprehensive evaluation. Moreover, the choice of metrics should depend on the specific application and clinical relevance.
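To make the metrics named above concrete, the following Python sketch computes the Dice coefficient, Jaccard index, precision, recall, F1 score, and PSNR on small hand-made inputs. It is a minimal illustration under stated assumptions, not the evaluation code of any cited study: masks and label vectors are flat lists of 0/1, images are flat lists of intensities, the function names are chosen here for illustration, and two empty masks are treated as a perfect match by convention.

```python
import math


def dice_jaccard(pred, truth):
    """Dice coefficient and Jaccard index for flat binary masks (0/1)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    pred_sum, truth_sum = sum(pred), sum(truth)
    union = pred_sum + truth_sum - tp
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * tp / (pred_sum + truth_sum) if (pred_sum + truth_sum) else 1.0
    jaccard = tp / union if union else 1.0
    return dice, jaccard


def precision_recall_f1(pred, truth):
    """Precision, recall and F1 score for flat binary label lists (0/1)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized flat images."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    predicted_mask = [1, 1, 0, 0, 1, 0]
    reference_mask = [1, 0, 0, 0, 1, 1]
    print("Dice, Jaccard:", dice_jaccard(predicted_mask, reference_mask))
    print("Precision, recall, F1:", precision_recall_f1(predicted_mask, reference_mask))
    print("PSNR (dB):", psnr([100, 100, 100, 100], [110, 90, 100, 100]))
```

The Dice coefficient D and the Jaccard index J are monotonically related (J = D / (2 - D)), so they rank methods identically; reporting one of them together with a surface distance measure is more informative than reporting both overlap scores.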
Several studies have directly compared the performance of different DL methods in medical imaging. For instance, a comparative study by Zhang et al. [161] evaluated convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for detecting tumors in lung CT scans. Both models performed well, but the CNN outperformed the RNN, with an accuracy of 92% compared to 89%. The CNN also demonstrated higher sensitivity and specificity, suggesting an advantage for CNNs in this kind of imaging task. Another comparison, by Patel et al. [162], contrasted deep belief networks (DBNs) and CNNs for detecting breast cancer in mammograms. Although both models achieved high accuracy, the DBN achieved a higher area under the receiver operating characteristic (ROC) curve (AUC), 0.96 compared to the CNN’s 0.92. This finding suggests that DBNs might offer an edge over CNNs in distinguishing between malignant and benign cases in mammography.
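The comparisons above rely on sensitivity, specificity, and AUC. The Python sketch below shows how these quantities can be computed from model scores and ground-truth labels. The data are invented toy values, not results from [161] or [162], and the function names are illustrative. AUC is computed here through its rank interpretation, as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def sensitivity_specificity(pred, truth):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def roc_auc(scores, truth):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, t in zip(scores, truth) if t == 1]
    negatives = [s for s, t in zip(scores, truth) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy scores for six cases; 1 = malignant, 0 = benign (illustrative only).
    scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
    labels = [1, 1, 1, 0, 0, 0]
    decisions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print("Sensitivity, specificity:", sensitivity_specificity(decisions, labels))
    print("AUC:", roc_auc(scores, labels))
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality over all thresholds. This is one reason two models can be ordered differently by accuracy and by AUC.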
Publicly available datasets and competitions play a critical role in advancing DL research for medical imaging. These resources provide standardized data and evaluation protocols for comparing different methods and fostering collaboration among researchers. There are various publicly available datasets for different medical imaging modalities, such as the Cancer Imaging Archive (TCIA) for CT and MRI, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) for MRI [163], and the Retinal OCT (ORIGA) dataset for OCT [164]. In addition, several competitions have been organized to benchmark the performance of DL methods for medical imaging, such as the International Symposium on Biomedical Imaging (ISBI) challenge [165] and the Medical Segmentation Decathlon [166]. However, the availability and quality of publicly available datasets and competitions can vary across different medical imaging modalities and tasks. Moreover, some datasets may have limited diversity in terms of patient populations and imaging protocols, which can affect the generalizability of the results. To address these issues, it is important to establish standards and guidelines for dataset curation and evaluation protocols. Collaborative efforts among researchers, clinicians, and industry partners are needed to ensure the availability and quality of publicly available datasets and competitions for DL research in medical imaging.
In recent years, the rapid development and widespread use of DL techniques in medical imaging have raised a number of ethical considerations, ranging from data privacy and security to bias and fairness, explainability and interpretability, and integration with clinical workflows. In this section, we discuss some of these issues and their potential impact on the future of DL in medical imaging.
One of the main ethical concerns associated with DL in medical imaging is the need to protect patient data privacy and ensure data security. Medical images contain sensitive information about patients, and their unauthorized use or disclosure could have serious consequences for their privacy and well-being. Therefore, it is essential to implement appropriate measures to protect the confidentiality, integrity, and availability of medical images and associated data. Several studies have proposed various methods for enhancing data privacy and security in medical imaging, including encryption, anonymization, and secure data sharing protocols [167]. These methods can help to protect patient data privacy and reduce the risk of data breaches or cyberattacks.
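As one illustration of the anonymization step mentioned above, the Python sketch below replaces direct identifiers in an image metadata record with a keyed pseudonym. It is a sketch under stated assumptions: the field names (`patient_name`, `patient_id`, `birth_date`) and the record layout are hypothetical, the key would have to be stored securely outside the dataset, and removing these fields alone does not guarantee de-identification, since identifying information can also appear in pixel data or free-text fields.

```python
import hashlib
import hmac

# Hypothetical names of directly identifying metadata fields.
IDENTIFYING_FIELDS = ("patient_name", "patient_id", "birth_date")


def pseudonymize(record, secret_key):
    """Return a copy of `record` without direct identifiers.

    A keyed hash (HMAC-SHA256) of the patient ID is added so that images
    from the same patient can still be linked without exposing the ID.
    """
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    digest = hmac.new(
        secret_key, record["patient_id"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned["pseudonym"] = digest[:16]  # shortened for readability
    return cleaned


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # assumption: managed separately in practice
    record = {
        "patient_name": "Jane Doe",
        "patient_id": "12345",
        "birth_date": "1970-01-01",
        "modality": "CT",
        "slice_thickness_mm": 1.25,
    }
    print(pseudonymize(record, key))
```

A keyed hash is used rather than a plain hash because patient identifiers are often short and guessable; without the key, an attacker could hash candidate identifiers and match them against the pseudonyms.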
Another important ethical consideration in the use of DL in medical imaging is the risk of bias and unfairness. DL models are trained on large datasets, and if these datasets are biased or unrepresentative, the resulting models can perpetuate or amplify these biases, leading to unfair or inaccurate predictions [168]. Several studies have highlighted the issue of bias in medical imaging datasets, such as disparities in the representation of certain demographic groups [169]. To address these issues, researchers have proposed various approaches, such as data augmentation, data balancing, and fairness-aware training [77]. These methods can help to mitigate bias and improve the fairness of DL models.
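Data balancing can take several forms. One simple variant, sketched below in Python, weights each class inversely to its frequency so that under-represented classes contribute equally to the training loss. The formula w_c = N / (K * n_c), with N samples, K classes, and n_c samples in class c, is a common heuristic and an assumption of this sketch, not a method taken from the cited studies; the labels are invented. The same idea can be applied to demographic subgroups instead of diagnostic classes.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = N / (K * n_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


if __name__ == "__main__":
    # Toy, strongly imbalanced label set (illustrative only).
    labels = ["benign"] * 90 + ["malignant"] * 10
    weights = balanced_class_weights(labels)
    print(weights)
    # Each class now carries the same total weight in the loss.
    print({c: weights[c] * labels.count(c) for c in weights})
```

Reweighting changes how much each sample counts during training but adds no new information; if a group is missing from the data entirely, no weighting scheme can compensate for it.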
The black-box nature of DL models is another ethical concern in medical imaging, as it can make it difficult to understand how they arrive at their predictions, and to identify potential errors or biases [170]. This lack of transparency and interpretability can limit the usefulness of DL in clinical settings, where explainability and interpretability are critical for building trust and confidence among healthcare providers and patients. To address these issues, researchers have proposed various methods for enhancing the explainability and interpretability of DL models, such as attention mechanisms, saliency maps, and counterfactual explanations [44]. These methods can help to improve the transparency and interpretability of DL models and facilitate their integration into clinical workflows.
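Saliency maps can be produced in several ways. The Python sketch below illustrates one model-agnostic variant, occlusion sensitivity: each patch of the image is masked in turn, and the resulting drop in the model score is recorded as the importance of that patch. The `toy_score` function is a hypothetical stand-in for a trained model, and the 4 x 4 image is invented; gradient-based saliency methods differ in detail.

```python
def occlusion_saliency(image, score_fn, patch=1, baseline=0.0):
    """Score drop caused by masking each patch of a 2D image (list of lists)."""
    height, width = len(image), len(image[0])
    base_score = score_fn(image)
    saliency = [[0.0] * width for _ in range(height)]
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            occluded = [row[:] for row in image]  # copy so the input is untouched
            rows = range(r, min(r + patch, height))
            cols = range(c, min(c + patch, width))
            for rr in rows:
                for cc in cols:
                    occluded[rr][cc] = baseline
            drop = base_score - score_fn(occluded)
            for rr in rows:
                for cc in cols:
                    saliency[rr][cc] = drop
    return saliency


def toy_score(image):
    """Hypothetical model: mean intensity of the central 2x2 region of a 4x4 image."""
    return (image[1][1] + image[1][2] + image[2][1] + image[2][2]) / 4.0


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    for row in occlusion_saliency(image, toy_score):
        print(row)
```

In this toy case, only the four central pixels receive a non-zero saliency value, which matches the region the stand-in model actually uses. With a real network, the same check helps reveal whether predictions depend on clinically meaningful regions or on artifacts such as image borders or annotations.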
The integration of DL into clinical workflows is another important consideration in the use of DL in medical imaging. To be clinically useful, DL models must be integrated into clinical workflows in a way that is efficient, reliable, and effective [171]. This requires careful consideration of various factors, such as the availability and accessibility of data, the quality and relevance of predictions, and the impact on clinical decision-making. Several studies have proposed various methods for integrating DL into clinical workflows, such as decision support systems, clinical decision rules, and workflow optimization [172]. These methods can help to streamline the use of DL in clinical settings and improve the efficiency and effectiveness of clinical decision-making.
Looking forward, there are several key areas for future research in the use of DL in medical imaging. These include the following:
(1) Developing more robust and accurate DL models that can handle variations in data quality and heterogeneity.
(2) Enhancing the interpretability and explainability of DL models to facilitate their integration into clinical workflows.
(3) Addressing ethical considerations, such as data privacy and security, bias and fairness, and regulatory compliance.
(4) Investigating the potential of using DL in combination with other modalities, such as genomics, proteomics, and metabolomics, to improve the accuracy and specificity of medical imaging diagnoses.
(5) Exploring the use of DL in personalized medicine, where models can be trained on patient-specific data to provide tailored treatment recommendations.
(6) Developing methods for ensuring the robustness and generalizability of DL models across different populations and clinical settings.
(7) Investigating the potential of using DL to automate the entire medical imaging pipeline, from acquisition to analysis to interpretation.
In conclusion, DL techniques have shown great promise in the field of medical imaging, with a wide range of applications and potential benefits for patient care. However, their use also raises important ethical considerations, such as data privacy and security, bias and fairness, and explainability and interpretability. Addressing these issues will be critical to realizing the full potential of DL in medical imaging and ensuring that its benefits are equitably distributed. Future research should focus on developing more robust and accurate models, enhancing their interpretability and explainability, and exploring new applications and use cases for DL in medical imaging.
Moreover, it is important to collaborate with healthcare providers, patients, and other stakeholders to ensure that the development and use of DL models in medical imaging align with their needs and priorities. This includes involving patients in the design and evaluation of DL models and ensuring that the benefits of these models are accessible to all, regardless of socioeconomic status, race, or ethnicity. In addition, regulatory frameworks must be established to ensure that DL models meet ethical and quality standards and that their use is transparent and accountable. This includes developing guidelines for data privacy and security, bias and fairness, and explainability and interpretability, as well as establishing standards for model validation and performance evaluation.
Overall, DL has the potential to revolutionize the field of medical imaging and transform the way we diagnose and treat diseases. However, its success will depend on addressing the ethical and technical challenges that come with its use and on developing a collaborative and patient-centered approach to its development and implementation. With continued research and innovation, DL is poised to make a significant contribution to the advancement of healthcare and improve the lives of patients around the world.
In this review article, we provided a comprehensive analysis of DL techniques and their applications in the field of medical imaging. We discussed the impact of DL on disease diagnosis and treatment and how it has transformed the medical imaging landscape. Furthermore, we reviewed the most recent DL techniques, such as CNNs, RNNs, and GANs, and their applications in medical imaging.
We explored the application of DL in various medical imaging modalities, including MRI, CT, PET, ultrasound imaging, and OCT. We also discussed the evaluation metrics and benchmarks used to assess the performance of DL algorithms in medical imaging, as well as the ethical considerations and future perspectives of the field.
Moving forward, the integration of DL with medical imaging is expected to continue revolutionizing the diagnosis, treatment, and management of diseases. The development of more advanced algorithms, coupled with the ever-increasing availability of medical imaging data, will undoubtedly contribute to significant advancements in healthcare. However, the medical community must also address the various challenges and ethical considerations that arise in the application of DL, such as data privacy, security, bias, and interpretability, to ensure that the technology is responsibly harnessed for the betterment of patient care.
Overall, DL in medical imaging holds great promise for improving healthcare outcomes and advancing the field of medicine. As the technology continues to evolve, it is essential for researchers, clinicians, and other stakeholders to work collaboratively to overcome challenges, address ethical concerns, and fully realize the potential of DL in medical imaging.
Conceptualization, H.Z. and Y.Q.; methodology, H.Z. and Y.Q.; software, H.Z. and Y.Q.; validation, H.Z. and Y.Q.; formal analysis, H.Z. and Y.Q.; investigation, H.Z. and Y.Q.; resources, H.Z. and Y.Q.; data curation, H.Z. and Y.Q.; writing—original draft preparation, H.Z. and Y.Q.; writing—review and editing, H.Z. and Y.Q.; visualization, H.Z. and Y.Q.; supervision, H.Z.; project administration, H.Z. and Y.Q.; funding acquisition, H.Z. All authors have read and agreed to the published version of the manuscript.
This work was funded in part by the National Natural Science Foundation of China under Grants 62127812 and 61971335.
Figure 1. A comparison of various medical imaging modalities. Scinti: Scintigraphy; SPECT: Single-Photon Emission Computed Tomography; Optical: Optical Imaging; PET: Positron Emission Tomography; CT: Computed Tomography; US: Ultrasound; MRI: Magnetic Resonance Imaging.