Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin et al. IEEE Trans Med Imaging. 2016 May;35(5):1285-98. doi: 10.1109/TMI.2016.2528162. Epub 2016 Feb 11.

Abstract

Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully apply CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained on natural image datasets for medical image tasks. In this paper, we exploit three important, but previously understudied, factors in applying deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters and vary in number of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from ImageNet pre-trained models (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve state-of-the-art performance on mediastinal LN detection and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis, and insights can be extended to the design of high-performance CAD systems for other medical imaging tasks.
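The five-fold cross-validation protocol mentioned above is usually set up so that splits are made at the patient level, so slices from one patient never appear in both training and test folds. A minimal sketch of such a split (our illustration, not the authors' code; the greedy balancing heuristic is an assumption):

```python
from collections import defaultdict

def patient_level_folds(slice_ids, patient_of, k=5):
    """Partition CT slice indices into k folds such that all slices
    from the same patient land in the same fold (no patient leakage)."""
    by_patient = defaultdict(list)
    for s in slice_ids:
        by_patient[patient_of[s]].append(s)
    folds = [[] for _ in range(k)]
    # Greedy balancing: assign each patient's slices to the currently smallest fold.
    for slices in sorted(by_patient.values(), key=len, reverse=True):
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(slices)
    return folds

# Hypothetical toy data: 10 slices from 4 patients, split into 3 folds.
patient_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B",
              5: "C", 6: "C", 7: "C", 8: "C", 9: "D"}
folds = patient_level_folds(list(range(10)), patient_of, k=3)
```

Each fold then serves once as the test set while the remainder trains the CNN; the reported metric is averaged over the k runs.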


Figures

Fig. 1. Some examples of abdominal and mediastinal lymph nodes sampled on axial (ax), coronal (co), and sagittal (sa) views, with four different fields-of-view (30 mm: orange; 45 mm: red; 85 mm: green; 128 mm: blue) surrounding the lymph nodes.
Fig. 2. Some examples of CT image slices with six lung tissue types in the ILD dataset. Diseased tissue regions are indicated with dark orange arrows. (a) healthy; (b) emphysema; (c) ground glass; (d) fibrosis; (e) micronodules; (f) consolidation.
Fig. 3. Some examples of 64×64 CT image patches for (a) NM, (b) EM, (c) GG, (d) FB, (e) MN, (f) CD.
Fig. 4. An example of lung/high-attenuation/low-attenuation CT windowing for an axial lung CT slice. We encode the lung, high-attenuation, and low-attenuation CT windows into the red, green, and blue channels, respectively.
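The windowing scheme described in Fig. 4 can be sketched as below. The specific Hounsfield-unit window bounds are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def windowed_rgb(hu_slice):
    """Map one CT slice (in Hounsfield units) to a 3-channel image:
    lung / high-attenuation / low-attenuation windows -> R / G / B."""
    def window(img, lo, hi):
        # Clip to [lo, hi] and rescale to [0, 1].
        return (np.clip(img, lo, hi) - lo) / (hi - lo)
    r = window(hu_slice, -1400, -200)   # lung window (assumed bounds)
    g = window(hu_slice, -160, 240)     # high-attenuation / soft-tissue window (assumed)
    b = window(hu_slice, -1000, -775)   # low-attenuation window (assumed)
    return np.stack([r, g, b], axis=-1)

# Toy 2x2 slice in HU; result has shape (2, 2, 3) with values in [0, 1].
slice_hu = np.array([[-1000.0, -600.0], [40.0, 300.0]])
rgb = windowed_rgb(slice_hu)
```

This turns a single-channel CT slice into a three-channel input compatible with CNNs pre-trained on RGB natural images.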
Fig. 5. A simplified illustration of the CNN architectures used. GoogLeNet contains two convolution layers, three pooling layers, and nine inception layers. Each inception layer of GoogLeNet consists of six convolution layers and one pooling layer.
Fig. 6. Illustration of an inception layer of GoogLeNet. Inception layers of GoogLeNet consist of six convolution layers with different kernel sizes and one pooling layer.
Fig. 7. Some examples from the CIFAR-10 dataset and some images of the "tennis ball" class from the ImageNet dataset. CIFAR-10 images are small (32×32), with the object of the class category centered. ImageNet images are larger (256×256), and the object of the class category can be small, obscured, partial, and sometimes in a cluttered environment.
Fig. 8. FROC curves averaged over three-fold cross-validation for the abdominal (left) and mediastinal (right) lymph nodes using different CNN models.
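As background for how curves like those in Fig. 8 are constructed: at each detection-score threshold, the y-axis is sensitivity (fraction of true lymph nodes found) and the x-axis is the average number of false positives per scan. A minimal sketch (our illustration, not the authors' evaluation code):

```python
def froc_points(detections, n_true, n_scans):
    """detections: list of (score, is_true_positive) over all candidate
    detections from all scans. Returns (fp_per_scan, sensitivity) pairs
    as the threshold is lowered past each candidate's score."""
    dets = sorted(detections, key=lambda d: d[0], reverse=True)
    points, tp, fp = [], 0, 0
    for score, is_tp in dets:
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_scans, tp / n_true))
    return points

# Hypothetical toy example: 4 candidates over 2 scans, 2 true nodes.
pts = froc_points([(0.9, True), (0.8, False), (0.6, True), (0.2, False)],
                  n_true=2, n_scans=2)
```

Plotting these points (sensitivity against FP/scan) yields the FROC curve; averaging curves over cross-validation folds gives figures like Fig. 8.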
Fig. 9. Examples of misclassified lymph nodes (in axial view): false negatives (left) and false positives (right). Mediastinal LN examples are shown in the upper row and abdominal LN examples in the bottom row.
Fig. 10. Visual examples of misclassified 64×64 ILD patches (in axial view), with their ground-truth labels and the incorrectly predicted labels.
Fig. 11. Traces of training and validation loss (blue and green lines) and validation accuracy (orange lines) during (a) training AlexNet from random initialization and (b) fine-tuning from an ImageNet pre-trained CNN, for ILD classification.
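Loss traces like those in Fig. 11 are commonly used to decide when to stop training: once validation loss stops improving, further epochs mainly overfit. A minimal early-stopping helper (our illustration, not the paper's procedure):

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop: the first epoch at
    which validation loss has failed to improve for `patience` epochs.
    Returns None if the trace ends before the criterion triggers."""
    best, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None

# Hypothetical validation-loss trace: improves, then plateaus.
stop = early_stop_epoch([1.0, 0.7, 0.5, 0.52, 0.51, 0.53, 0.55])
```

Fine-tuned models typically reach their validation-loss minimum in fewer epochs than models trained from random initialization, which is the comparison Fig. 11 makes.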
Fig. 12. Visualization of first-layer convolution filters of CNNs trained on abdominal and mediastinal LNs, in RGB color, from random initialization (AlexNet-RI (256×256), AlexNet-RI (64×64), GoogLeNet-RI (256×256), and GoogLeNet-RI (64×64)) and with transfer learning (AlexNet-TL (256×256)).
Fig. 13. Visualization of the last pooling layer (pool5) activations (top). Pooling units whose relative image location corresponds to the disease region are highlighted with green boxes. The original images reconstructed from the units are shown at the bottom. The examples in (a) and (b) are computed from the input ILD images in Figs. 2(b) and 2(c), respectively.
