Ultrasound has become one of the most frequently used methods for screening and diagnostic imaging due to its relatively low cost, minimal side effects, and wide accessibility. Despite these advantages, frequent use of ultrasound poses persistent clinical challenges: lower image quality, variable operator experience, and inconsistency from institution to institution. Ultrasound is therefore an appealing target for automatic image analysis aimed at increasing the objectivity of ultrasound studies. Deep learning, a form of machine learning that learns abstract features directly from raw data such as ultrasound images, is an enticing approach given the wide variability of image interpretation within the field. It has delivered state-of-the-art performance for automatic analysis across a number of research domains, and it has already been applied to many ultrasound analysis tasks; the universal tasks that apply across different areas of the body are classification, detection, and segmentation. These tasks succeed most consistently under supervised learning, in which image annotations are needed to maximize the capability of the deep learning method. Unsupervised learning, which reduces cost by reducing human labeling effort, has also shown promising results. In addition, deep learning methods may help overcome some of the shortcomings of typical two-dimensional ultrasound images, which must represent complex three-dimensional anatomy. This review summarizes nearly 100 pertinent papers to offer an overview of the use of deep learning in medical ultrasound analysis.
The deep learning architectures useful for evaluating ultrasound images are specialized and complex. To accomplish classification, segmentation, or detection, the major classes of architectures are deep discriminative models (i.e., supervised deep networks), deep generative models (i.e., unsupervised deep networks), and hybrid deep networks. Supervised deep networks are currently the most widely used, and include convolutional neural networks (CNNs) and recurrent neural networks (RNNs). A CNN stacks multiple convolutional layers into a deep model that captures both spatial and configurational information, and the network can be customized to trade performance for lower computational or time cost, depending on the task. Many CNN-derived architectures have been developed successfully and applied in other areas, such as natural language processing and speech recognition. In contrast, RNNs are particularly useful for evaluating ultrasound video sequences, because the length of the input sequence is not limited. Even so, RNNs have been used infrequently in ultrasound analysis because the networks are difficult to train, and they have largely been applied to speech- or text-recognition tasks.
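As a minimal illustration (not drawn from any specific paper in this review), the spatial feature extraction that a single CNN layer performs — convolution, nonlinearity, pooling — can be sketched in plain NumPy. The toy image, kernel, and layer sizes below are illustrative assumptions, not values from the surveyed work.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Elementwise nonlinearity applied after convolution."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trims edges that do not fit."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w]
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy "ultrasound" patch with a bright vertical boundary.
img = np.zeros((8, 8))
img[:, 4:] = 1.0

# A vertical-edge kernel (the kind of filter a CNN might learn).
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

features = max_pool(relu(conv2d(img, kernel)))
print(features.shape)  # (3, 3)
print(features.max())  # 2.0 — strongest response at the boundary
```

A trained CNN stacks many such layers, learning the kernel weights from data rather than fixing them by hand.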
The unsupervised deep models include the auto-encoder (AE) and the restricted Boltzmann machine (RBM). The AE is a feature-extraction approach that is better suited to extracting and preserving information than to performing specific tasks with that information. RBMs serve a similar representation-learning role and, in practice, are stacked to form a probabilistic model termed a deep belief network (DBN), which can generalize well when trained with a limited number of labeled samples followed by a final fine-tuning step. One of the greatest challenges in training deep models is the large number of labeled training samples required to achieve acceptable performance, and overfitting is a frequent problem when training deep architectures to analyze ultrasound images. Model optimization and transfer learning strategies have been implemented to mitigate this problem, reducing the need for costly labeling efforts during training.
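To make the auto-encoder idea concrete, here is a minimal sketch of a single-hidden-layer linear auto-encoder trained by gradient descent on synthetic data. Everything here — the data dimensions, learning rate, and iteration count — is an illustrative assumption; no labels are used, which is the sense in which the AE learns features in an unsupervised way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 8-D lying near a 2-D subspace, standing in
# for flattened image patches with low intrinsic dimension.
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 8))
X = latent @ basis + 0.05 * rng.normal(size=(200, 8))

# Encoder maps 8 -> 2 (the learned features); decoder maps 2 -> 8.
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

def loss(X, W_enc, W_dec):
    """Mean squared reconstruction error — the only training signal."""
    recon = X @ W_enc @ W_dec
    return float(np.mean((recon - X) ** 2))

lr = 0.005
initial = loss(X, W_enc, W_dec)
for _ in range(1000):
    H = X @ W_enc                      # codes (extracted features)
    err = H @ W_dec - X                # reconstruction error
    grad_dec = H.T @ err / len(X)      # gradient w.r.t. decoder weights
    grad_enc = X.T @ (err @ W_dec.T) / len(X)  # gradient w.r.t. encoder
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
final = loss(X, W_enc, W_dec)
print(initial, final)  # reconstruction error falls as features are learned
```

Real AEs add nonlinearities and depth, but the training signal is the same: reconstruct the input, so no annotations are required.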
Deep learning models have been applied successfully to numerous anatomical structures across the tasks of classification, detection, and segmentation. Classification has been performed efficaciously on a number of different tissues, including tumors/lesions, nodules, and fetal structures: classifying breast tumors, mass lesions, liver masses, and thyroid nodules has shown promise with deep learning methods. Deep learning has also acted effectively as quality control during fetal ultrasound imaging, especially when evaluating abdominal circumference and cardiac structure. Detection tasks, like classification, have largely been applied to tumors, lesions, and fetal evaluations; the most work thus far has addressed the detection of breast tumors/lesions, and detection may also help confirm fetal viability and establish gestational age. Segmentation is needed to analyze the volume and shape of many organs, both rigid and non-rigid, and segmenting the organs of interest enables better classification and detection of lesions or other masses. Deep learning methods have largely focused on 2D image analysis because of the challenges of training a deep network on 3D inputs.
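Segmentation quality is commonly reported with overlap metrics such as the Dice similarity coefficient; as a brief hedged sketch (the masks below are invented toy data, not from any surveyed study), it can be computed as:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2 * |pred ∩ target| / (|pred| + |target|)."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy ground-truth organ mask and an imperfect predicted mask.
target = np.zeros((10, 10), dtype=int)
target[2:8, 2:8] = 1          # 36 pixels
pred = np.zeros((10, 10), dtype=int)
pred[3:8, 2:8] = 1            # 30 pixels, all inside the target

print(round(dice_score(pred, target), 4))  # 2*30 / (30 + 36) ≈ 0.9091
```

A Dice score of 1 indicates perfect overlap with the reference mask and 0 indicates none, which is why it is a standard yardstick for organ and lesion segmentation.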
In summary, deep learning methods already play a substantial role in the analysis of medical ultrasound images, but many areas still pose challenges and could be improved. The primary improvement would be larger training datasets, which unfortunately carry very high costs that can prohibit their consistent use; training on artificial rather than real sampled images may result in lower performance. Finally, continued development of 3D ultrasound imaging will increase the utility of deep learning in these areas, as the number of labeled samples would increase drastically. As these challenges are overcome, deep learning methods will continue to play an important role in improving the clinical use of medical ultrasound images.