Automated lung tumor detection and diagnosis in CT Scans using texture feature analysis and SVM

CT scans are an important tool for the diagnosis of lung tumors. This work presents an automated system for lung tumor diagnosis on CT scans. Scans are automatically segmented using a marker-based watershed transformation, which can also segment tumors that adjoin the lung wall and are therefore hard to separate. The scans are then analyzed in a sliding window approach using Haralick features and a Support Vector Machine classifier to detect tumors and classify them as benign or malignant. This approach to classification was tested on the LUNGx Challenge dataset [1] and achieved an AUC of 0.61 while using only the 10-patient calibration set for training.


I. INTRODUCTION
Cancer is still one of the most frequent causes of death worldwide [2]. Lung cancer is the leading cause of cancer deaths among men and one of the most commonly diagnosed cancers in women [3]. An early diagnosis is important, as it can influence the choice of treatment and thus prolong the patient's life. A widely used diagnostic method is the analysis of computed tomography (CT) scans of the lung. Recent work has shown that selected texture features can be used on CT scans to distinguish between benign and malignant pulmonary nodules [4], [5], [6]. The presented approach introduces a workflow that automatically identifies and classifies tumor tissue in lung CT scans by extracting Haralick texture features [7] and classifying image regions in a sliding window approach using a Support Vector Machine (SVM) classifier.

II. BACKGROUND
For a systematic detection of tumors in lung CT scans, several sub-problems have to be considered. We discuss them together with their state-of-the-art solutions. The general procedure and methodology of a tumor diagnosis system can be subdivided into the following four sub-problems [8]:
1) Preprocessing: The goal of preprocessing is the reduction of unwanted artifacts and noise that often occur in CT scans. The preprocessing step facilitates the further processing of the image and may also be used to enhance certain image features for later processing.
2) Segmentation: Segmentation is used to separate semantically coherent image areas. It is a crucial step for a successful classification, because the segmentation result significantly influences the results of the following processing steps.
3) Feature Extraction: This step uses algorithms to extract selected features from the image. Lung tumors often differ in size, texture or contour.
4) Classification: After the feature extraction, each identified region is evaluated based on its characteristics. Based on the rating of the chosen classifier, images or image areas may be assigned to a positive or negative class.
Based on these sub-steps, a system for tumor recognition and classification can be created from a combination of different approaches that each solve one or more of these sub-problems. For segmentation, feature extraction and classification, numerous different methods can be utilized.
Recent works that concentrate on texture features for cancer analysis achieve promising classification results using an SVM classifier for tumor detection and evaluation. A recent work by Nilesh Bhaskarrao Bahadure et al. [9] shows the impact of texture features in combination with an SVM classifier for tumor detection in brain MRI scans. They achieve an accuracy of 96.51%. The use of texture features for tumor detection has also been studied for lung CT scans by several research groups:
• Zayed and Elnemr [4] study the effectiveness of Haralick texture features for the identification of lungs with malignant pulmonary nodules. For segmentation, the lung with the largest volume is mirrored and used as a mask for the second lung to separate tumors that have grown together with the lung wall. They conclude that selected texture features could be useful for the detection of abnormalities in CT lung scans. Although this approach may detect abnormal lungs, the position of the abnormal tissue within the diseased lung cannot be determined.
• Han et al. [5] [...] results for the differentiation between malignant and benign nodules (AUC of 92.7%). A segmentation step for a complete tumor recognition system is missing, since the precise tumor position has already been determined by the radiologist. Thus, the work does not provide a system that can automatically segment and classify tumors without the preliminary work of a specialist.
• Zhao et al. [6] present a complete workflow that implements automatic segmentation and separation of tumor tissue using thresholding and morphological operations, without prior knowledge of tumor positions. They achieve an accuracy between 86.8% and 93.9%. They classify tumors of 3 different predefined size groups. This in turn presupposes a manual division of the data, which simplifies the segmentation problem by reducing the number of parameters. The proposed approach is not fully automated, because a functioning segmentation requires information on the position or size of the tumor in advance.
We suggest a system for automated tumor detection which implements all sub-steps (preprocessing, segmentation, feature extraction and classification) without relying on previous knowledge of tumor type, position or size. Our system is able to automatically detect tumors in lung CT scans and classify them as benign or malignant. In addition to this fully automated approach, we provide a user interface to evaluate results independently, to set markers that optimize segmentation results, and to select fixed cutouts for classification.
We evaluate our approach using a data set from the SPIE-AAPM Lung CT Challenge [10], [11], [1], which consists of CT scans of 70 patients of different age groups with a slice thickness of 1 mm. For each patient, the scans contain one or more lung tumors, each either benign or malignant, identified by follow-up examinations or pathological assessments by experts. Both the position of the tumor center and the classification into benign or malignant were annotated. The dataset is divided into a calibration dataset containing 10 patients and a test dataset covering the remaining 60 patients. The calibration dataset contains CT scans with exactly 5 benign and 5 malignant tumors. The test dataset consists of a total of 73 sections with 36 malignant and 37 benign tumors. The size of the tumors in the dataset varies widely, with small tumors being less than 3 mm in diameter and large tumors larger than 35 mm in diameter. Distinguishing benign from malignant tumors on this dataset is very demanding. Out of all 11 approaches submitted to this challenge, only 3 achieved an AUC score significantly better than random guessing [10]. The AUC values of radiologist assessments ranged between 0.7 and 0.85. The best participant scored an AUC value of 0.68, see [10].

III. METHODS
This approach follows the results of Zayed and Elnemr [4], Han et al. [5], and Zhao et al. [6] and uses texture features for feature extraction. An SVM is used for classification, as it proved to be more successful than other classifiers including neural networks, see Bahadure et al. [9]. The tumor detection is based on multiscale sliding windows, since this method is independent of the size of the searched object. In a preceding segmentation step, the lungs are extracted to reduce the search area. This improves the results and reduces the computing time. For the segmentation, the marker-based watershed transformation, as proposed by Kulkarni et al. [8], is used. This method can also be used to separate tumors that have grown into the lung wall. The markers are computed in a preliminary step using morphological operations. Going beyond the state-of-the-art works introduced above, we present a complete system for tumor detection and diagnosis that performs all steps of a tumor recognition system described above and is independent of the tumor size.
Our approach can be divided into three steps:
1) Preprocessing and Segmentation: In this first step, the lungs are separated from the rest of the tissues and the image background. Using morphological operations and marker-based watershed transformation (WST), the image background and the rib cage are removed.
2) Tumor detection: Using a sliding window approach, texture features are evaluated on different scales to compute heat maps, which are used to identify tumor structures. The heat maps indicate image areas which contain potentially tumor-like textures, without yet providing information about the malignancy of tumors. For later evaluation, the user can either use this information to mark structures manually or automatically select all interesting image areas based on the previously calculated heat map.
3) Tumor classification: Benign and malignant nodules can again be differentiated by certain texture features as proposed by [4], [5], [6]. The areas selected by the heat map or manually selected by the user are again evaluated by an SVM using texture features.
The complete workflow is shown in figure 1. The individual steps of the diagnosis process are described in the following sections.
For the purpose of training and evaluation, we inspected the slices containing the tumor center for all ten patients in the calibration set. We obtain one or more grayscale images of 512 × 512 pixels per patient, depending on how many tumor centers have been annotated. The SVM was trained using the calibration set provided by the challenge. We evaluated our results using the test data set consisting of a total of 73 images. The test set contained 30 images in which tumors were already fused with the lung wall. In the remaining 43 test images, the tumors were isolated inside the lung. All CT images shown in this paper are either taken from the dataset provided by the challenge [1] or processed by our presented diagnostic system.

Preprocessing and Segmentation
The segmentation of the CT scans isolates the internal lung tissue and facilitates the detection and classification of the tumors. Large areas of the image are removed in this process and do not need to be considered for later computation. The image background and then the thorax are removed in two steps, using binarization, erosion, connected component labeling and the marker-based watershed transformation. These operations were implemented using the OpenCV library (see [12] or https://opencv.org/). In addition to a fully automated approach, users are also provided with an interactive mode. Here users can set markers for an automatic separation of tissue based on marker positions, which may improve the segmentation results.
CT scans often contain artifacts in the image background. These can hinder further processing and segmentation because they form separate components that are also recognized as such by the connected component algorithm, even though they are not part of the tissue to be examined. These artifacts are removed from the image background in a first preprocessing step.
In several images, the chest adjoins the outer edge of the image, creating two separate background areas in the upper and lower part of the image. To prevent this and create a coherent image background, a margin of 5 pixels is set at the outer left and right sides of the image. Image noise is removed using a median filter to improve further processing. From the filtered image, a binary image is calculated using a threshold intensity value of 130. The connected component algorithm is applied to this binary image to identify the largest foreground region as the image background, which is removed in a new binary image. By erosion, all artifacts in the image background can now be removed until only the rib cage is left as a single foreground component. For this purpose, we erode the binary image with an increasingly larger square kernel, starting with a kernel size of 1. After each iteration, the number of foreground components is checked using the connected component algorithm. This step is repeated until only a single component, namely the rib cage, is detected. The remaining background is now applied as an image mask to the input image. The result is the isolated body scan without the image background. The whole procedure is shown in figure 2.
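The erosion loop described above can be sketched as follows. This is a minimal illustration, not the authors' OpenCV implementation: it uses SciPy's `ndimage` module instead, the function name `erode_until_single_component` and the `max_kernel` guard are hypothetical, and it assumes that each iteration erodes the original binary mask (rather than the previous result) with the current kernel size.

```python
import numpy as np
from scipy import ndimage


def erode_until_single_component(mask, max_kernel=51):
    """Erode a binary mask with growing square kernels until at most one
    foreground component remains. Returns (eroded_mask, kernel_size)."""
    mask = np.asarray(mask, dtype=bool)
    for k in range(1, max_kernel + 1):
        # Square k x k structuring element (k = 1 leaves the mask unchanged).
        eroded = ndimage.binary_erosion(mask, structure=np.ones((k, k), dtype=bool))
        # Count foreground components (4-connectivity is SciPy's default in 2D).
        _, n_components = ndimage.label(eroded)
        if n_components <= 1:
            return eroded, k
    raise ValueError("no single component found up to max_kernel")
```

The `max_kernel` guard is an addition for safety, since erosion can also split a large component before the small artifacts disappear.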
In the second step, the thorax is removed without removing any attached tumor structures inside the lung tissue. For this purpose, the output image of the previous step is used as input. The input image is smoothed and binarized as described above. The binary image is then eroded to separate possible tumor structures that are connected to the inside of the thorax. For the erosion, a 13 × 13 pixel kernel is used. In systematic tests, this kernel size separated the largest number of tumor structures of the test data set from the lung wall. Using connected component labeling, all isolated structures in the eroded binary image are now identified and saved as markers. These markers are then used in a marker-based watershed transformation on the binary image to separate the rib cage from the inner tissue. The different markers spread to all foreground pixels of the previously created binary image. The labeled area with the largest volume is identified as the chest area and is removed from the original image in the same way as the background before. In the final segmented image, only the two lungs remain. The methodology is illustrated in figure 3.
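The marker computation and the removal of the largest labeled area can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: instead of a watershed routine, the markers are spread over the binary image by a breadth-first flood, which is a simple stand-in for a marker-based watershed on a flat (binary) image. The function names `propagate_markers` and `remove_largest_segment` are hypothetical.

```python
from collections import deque

import numpy as np
from scipy import ndimage


def propagate_markers(mask, markers):
    """Spread marker labels to all connected foreground pixels of `mask`
    with a multi-source breadth-first flood (4-connectivity)."""
    labels = markers.copy()
    queue = deque(zip(*np.nonzero(markers)))
    height, width = mask.shape
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width and mask[rr, cc] and labels[rr, cc] == 0:
                labels[rr, cc] = labels[r, c]
                queue.append((rr, cc))
    return labels


def remove_largest_segment(mask, kernel=13):
    """Erode, label the remaining blobs as markers, spread them over the
    mask and drop the largest segment (assumed to be the thorax)."""
    mask = np.asarray(mask, dtype=bool)
    eroded = ndimage.binary_erosion(mask, structure=np.ones((kernel, kernel), dtype=bool))
    markers, n_markers = ndimage.label(eroded)
    if n_markers == 0:
        return mask, 0
    labels = propagate_markers(mask, markers)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0  # ignore background and foreground pixels without a marker
    largest = int(sizes.argmax())
    return mask & (labels != largest), n_markers
```

A tumor that is joined to the wall by a thin bridge keeps its own marker after erosion and therefore survives the removal of the largest segment; a tumor that is eroded away completely receives the wall's label and is removed with it, which matches the limitation discussed in the text.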
With the proposed methodology, it is not possible to separate all tumors that are connected to the lung wall. The diameter of the junction between tumor and lung wall must be smaller than the total diameter of the tumor. Otherwise, the tumor cannot be separated from the lung wall by erosion without being removed entirely. In that case, no marker for the WST can be obtained, the tumor is assigned to the segment of the thorax in the WST step, and it is removed together with the thorax in the following step. Using the proposed method on the test data, 18 out of 30 tumors that were connected to the lung wall can be separated automatically. This corresponds to a loss of 16.44% of all tumors to be segmented in relation to the entire data set of 73 images. If markers are placed manually at critical locations before segmentation, 100% of the test images can be successfully segmented.
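The quoted loss rate follows directly from the counts given in the text, as the following short calculation shows. The variable names are illustrative only.

```python
TOTAL_IMAGES = 73             # test images
WALL_ATTACHED = 30            # tumors fused with the lung wall
SEPARATED_AUTOMATICALLY = 18  # of those, separated without user input

lost = WALL_ATTACHED - SEPARATED_AUTOMATICALLY          # 12 tumors
loss_rate = lost / TOTAL_IMAGES                          # share of all tumors lost
success_rate = (TOTAL_IMAGES - lost) / TOTAL_IMAGES      # automatically segmented

print(f"lost: {lost}, loss rate: {loss_rate:.2%}, success rate: {success_rate:.2%}")
```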

Tumor Detection
The recognition of tumors on the basis of texture features can be difficult if the size of the tumor is unknown. For windows of different sizes, texture features may have different values at the same image coordinate, because the texture of the window contents changes significantly with the window size. Even if a trained classifier gives a positive response to the texture features of the correct window size, the response may be negative if the window is too large or too small. To localize the tumor structures, our approach utilizes sliding windows of 11 different scales ranging from 29 × 29 pixels to 9 × 9 pixels, which correspond to the maximum and minimum size of all tumors found in the training set. The sliding window iterates through the image from the top left to the bottom right corner. Texture features are extracted from each window. An SVM uses these features to calculate a score and assign the window to a positive or negative class. In a result matrix, the entry corresponding to the central coordinate of the current window is increased if the respective section is assigned to the tumor class. The entry is additionally scaled with the SVM score of the respective window to assign more weight to windows that receive a high rating by the SVM. After the last weighted result matrix has been generated, all result matrices are concatenated and the resulting matrix is normalized to a maximum intensity of 255. The resulting matrix is used to create a heat map that identifies image areas with tumor-like texture.
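The accumulation of window scores into a heat map can be sketched as follows. This is a minimal sketch under several assumptions: the 11 scales are taken to be the odd sizes 9, 11, ..., 29, the window moves with a stride of 1, the per-scale result matrices are combined by summation, and `score_window` is a hypothetical callable that stands in for feature extraction plus the SVM decision value (positive meaning tumor class).

```python
import numpy as np


def heat_map(image, score_window, scales=range(9, 30, 2), stride=1):
    """Accumulate positive window scores at the window centers over all
    scales and normalize the result to a maximum intensity of 255."""
    height, width = image.shape
    total = np.zeros((height, width), dtype=float)
    for size in scales:
        half = size // 2
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                score = score_window(image[top:top + size, left:left + size])
                if score > 0:  # window assigned to the tumor class
                    total[top + half, left + half] += score
    peak = total.max()
    return total * (255.0 / peak) if peak > 0 else total
```

Any classifier that returns a signed score per window can be plugged in as `score_window`; the nested Python loops are kept for readability and would need vectorization for full 512 × 512 slices.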
The complexity of the heat map calculation is in $O(11nm) = O(n^2)$, where $n$ is the number of image lines, $m$ is the number of image columns, and $n = m$.
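The bound can be made explicit by counting window evaluations. The following short derivation assumes a stride of 1 and window sizes $s_k$ of 9, 11, ..., 29 pixels (the spacing of the 11 scales is an assumption); it counts evaluations only and does not include the cost of feature extraction per window.

```latex
\[
\sum_{k=1}^{11} (n - s_k + 1)(m - s_k + 1) \;\le\; 11\,nm,
\qquad s_k \in \{9, 11, \dots, 29\},
\]
\[
O(11\,nm) = O(nm) = O(n^2) \quad \text{for } n = m .
\]
```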
For classification purposes, this work uses the SVM-light implementation of Joachims (see [13] or http://svmlight.joachims.org). To train the SVM, positive samples were generated from the 10 images of the calibration dataset: for each patient, image sections around the annotated tumor centers were cut out at every possible scale from 9 × 9 pixels to 29 × 29 pixels. For the negative class, 20 random cutouts of a random size between 9 × 9 and 29 × 29 pixels were selected from the rest of the image. Sections for the negative class which contained a tumor center were discarded and regenerated. For all sections, JFeatureLib (see [14] or https://github.com/locked-fg/JFeatureLib) was used to extract feature vectors with texture features that were used to train the SVM. For the training of the SVM, an RBF kernel with a σ value of $10^{-8}$ was used. The best kernel and optimal parameters were determined experimentally by optimizing the accuracy on the training data. The accuracy was calculated using leave-one-out cross-validation.
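To illustrate what a texture feature vector of this kind contains, the following sketch computes a gray-level co-occurrence matrix (GLCM) and four Haralick-style features with NumPy. It is not JFeatureLib's implementation: the quantization to 8 gray levels, the single horizontal offset, the assumption of 8-bit input and the function names are all choices made for this example.

```python
import numpy as np


def glcm(window, levels=8, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix of an 8-bit window for one
    non-negative (row, column) offset."""
    q = np.floor(np.asarray(window, dtype=float) / 256.0 * levels).astype(int)
    q = q.clip(0, levels - 1)
    dr, dc = offset
    first = q[:q.shape[0] - dr, :q.shape[1] - dc]   # reference pixels
    second = q[dr:, dc:]                            # their offset neighbors
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (first.ravel(), second.ravel()), 1.0)
    counts += counts.T                              # count both directions
    return counts / counts.sum()


def haralick_subset(p):
    """Angular second moment, contrast, homogeneity and entropy of a GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "angular_second_moment": float((p ** 2).sum()),
        "contrast": float(((i - j) ** 2 * p).sum()),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "entropy": float(-(nonzero * np.log(nonzero)).sum()),
    }
```

A uniform window yields zero contrast and zero entropy, whereas a checkerboard of extreme values yields the maximum contrast for the chosen number of gray levels.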
The heat maps generated in the previous step describe regions whose texture is most similar to that of tumors. From these regions, excerpts are taken for evaluation in the last step. For this, the minimum bounding box of each isolated area of the heat map is calculated. Boxes that are smaller than 5 × 5 pixels are discarded because the smallest tumors already have a diameter of at least 9 pixels. Based on the calculated bounding box, the central coordinate of each area is determined. These central coordinates are then used to find the bounding box of the respective component in the input image which corresponds to the area of the heat map. Since the bounding box of the heat map does not always correspond to the full size of the respective component in the input image, this approach has the advantage that the new bounding box fully covers the component in the original image and can thus describe its texture as completely as possible.
The various components have already been separated by the WST in the preceding segmentation step, where each was given a unique label. Based on the previously determined central coordinate, the label and the associated component can be determined. Based on this information, a new bounding box can be calculated which corresponds to the component in the input image. For each window generated, the SVM again calculates a score which is intended to reflect the probability that a tumor is present in the respective window.
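The mapping from heat-map regions to component bounding boxes can be sketched as follows. This is a sketch under assumptions: the heat map has already been thresholded to a binary mask, a box is discarded if either side is shorter than 5 pixels, regions whose center falls on an unlabeled pixel are skipped, and the function name `candidate_windows` is hypothetical. SciPy's `ndimage` is used for labeling.

```python
import numpy as np
from scipy import ndimage


def candidate_windows(heat_mask, segment_labels, min_size=5):
    """For each isolated heat-map region, look up the segmentation label at
    the center of its bounding box and return that component's bounding box
    as (label, top, left, bottom, right) with exclusive bottom/right."""
    regions, _ = ndimage.label(np.asarray(heat_mask, dtype=bool))
    windows = []
    for rows, cols in ndimage.find_objects(regions):
        if rows.stop - rows.start < min_size or cols.stop - cols.start < min_size:
            continue  # too small to be a tumor
        center = ((rows.start + rows.stop - 1) // 2, (cols.start + cols.stop - 1) // 2)
        label = int(segment_labels[center])
        if label == 0:
            continue  # center does not lie on a segmented component
        rr, cc = np.nonzero(segment_labels == label)
        windows.append((label, int(rr.min()), int(cc.min()), int(rr.max()) + 1, int(cc.max()) + 1))
    return windows
```

The returned boxes cover the full component from the segmentation step even when the heat-map region only covers part of it.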

IV. RESULTS
The tumor detection was evaluated using 2 different strategies:
Strategy 1: All windows found were treated as tumors. This approach has the advantage of minimizing the number of false negative results. However, as many textures can be recognized as a tumor in some images, the number of false positives also increases significantly.
Strategy 2: Only the window with the highest SVM score is considered. The advantage of this strategy is that as many false positives as possible can be excluded. The disadvantage is that true positives can also be rejected as false negatives. This would be fatal, especially in the case of an actually diseased patient.
An example of a scan in which more than a single tumor is found is shown in figure 6. The bounding boxes of the two identified components are determined by the heat map in the original image and evaluated again by the SVM. In figure 6, the right calculated bounding box contains a correctly recognized tumor. The left bounding box includes a component that has been erroneously recognized as a tumor. With a score of 3.12, the correctly recognized tumor segment is rated higher by the SVM than the incorrectly recognized segment, which scores only 0.88.
The results of the window selection are described in table I. For the SPIE-AAPM Lung CT Challenge test dataset, only the coordinates of the tumor centers are annotated. A window is considered a true positive if it contains the annotated tumor center. All windows that do not contain a tumor center are considered false positives. Tumors that were not covered by any window are considered false negatives. Out of a total of 73 tumors, 59 tumors were detected and 14 tumors were not detected. Of the 59 recognized tumors, 44 had the highest SVM value of all detected windows in their image. Strategy 1 improves the recall by over 20% compared to Strategy 2. However, its precision is over 40% below that of Strategy 2. Strategy 2 thus also leads to a higher F-measure. Although Strategy 2 performs statistically better than Strategy 1, it should still be viewed critically for practical application. In an actual application, automatically recognized tumors could be confirmed or rejected by an expert; a false negative would have far worse consequences in such a scenario, as an unrecognized tumor would in any case be a risk for the patient.

Fig. 6: Two detected tumors with different SVM ratings. The right window with a rating of 3.12 contains a correctly detected tumor, the left window with a rating of 0.88 contains a falsely detected non-tumor structure.
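The recall values of the two strategies follow from the detection counts stated above. The sketch below shows the standard definitions; only recall is evaluated, because the false positive counts needed for precision are part of table I and are not repeated in the text. The function names are illustrative.

```python
def recall(true_positives, false_negatives):
    """Share of annotated tumors that were found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Share of reported windows that contain a tumor center."""
    return true_positives / (true_positives + false_positives)


def f_measure(precision_value, recall_value):
    """Harmonic mean of precision and recall."""
    return 2 * precision_value * recall_value / (precision_value + recall_value)


# Strategy 1: every window counts, 59 of 73 tumors are found.
recall_strategy_1 = recall(59, 73 - 59)
# Strategy 2: only the top-scoring window per image counts, 44 of 73 are found.
recall_strategy_2 = recall(44, 73 - 44)
print(f"{recall_strategy_1:.4f} {recall_strategy_2:.4f}")
```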

Tumor classification
In addition to the differentiation between tumor and non-tumor tissue, tumor tissue can in turn be classified as benign or malignant on the basis of its texture. For this purpose, the image windows previously obtained from the heat map are evaluated by a second SVM. In contrast to the previous step, the textures of the two classes differ only marginally. Feature dimensions whose mean is equal or similar for both classes contribute little to separating them. A reduction of the feature space by removing such dimensions can increase the accuracy of the classification. Therefore, only those features whose mean values differ significantly between the two classes are used to distinguish them. Qian Zhao et al. [6] already identified homogeneity, energy, correlation and entropy as the most discriminating features in t-tests in order to distinguish between benign and malignant tumors.
In this work, the mean values of the texture features were compared analogously. Using t-tests, p-values were determined for each feature in order to identify the discriminating characteristics. The features were extracted from windows containing tumors for the 10 patients of the calibration dataset. Only features with a p-value of less than 0.05 were used for the training of the SVM and for the classification. In order to evaluate the classification independently of the preliminary step, the tumor windows for evaluation were determined on the basis of the annotated central coordinates. The SVM was trained on the calibration data set based on the previously determined discriminating features. The performance was evaluated on the basis of the 73 sections of the test data set, as intended by the organizers of the challenge. Based on the results of the t-tests presented in Table II, the texture features Correlation, Variance, Average Sum, Sum Entropy, Entropy, Correlation 2, and Maximum Correlation Coefficient were selected for the training of the SVM. The significance of the features correlation, variance and entropy described in [6] can thus be confirmed. It is not possible to confirm the significance of the contrast and energy characteristics, which were rejected on the basis of the calculated p-values. One possible explanation for these different outcomes for the two features would be a different methodology for the tumor window selection. Because the entire bounding box was evaluated in this work, areas of the adjacent background were included in the classification.
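The selection of features by p-value can be sketched with SciPy's two-sample t-test. The text does not state which t-test variant was used, so the default Student's t-test of `scipy.stats.ttest_ind` is an assumption here, as are the function name `select_features` and the layout of the inputs (one row per tumor window, one column per feature).

```python
import numpy as np
from scipy import stats


def select_features(benign, malignant, names, alpha=0.05):
    """Return the names of features whose class means differ significantly
    according to a two-sample t-test (p-value below alpha)."""
    _, p_values = stats.ttest_ind(np.asarray(benign), np.asarray(malignant), axis=0)
    return [name for name, p in zip(names, p_values) if p < alpha]
```

With only five windows per class in the calibration set, such p-values rest on very few samples, which is worth keeping in mind when comparing the selected features with those of other studies.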
The SVM achieves the best classification results using the RBF kernel with a σ value of $10^{-8}$. The kernel with the highest performance and the corresponding optimal parameters were determined experimentally. On the test dataset, the classifier achieved a recall of 0.75, a precision of 0.5625, and an accuracy of 0.589. The ROC calculated for the SVM output has an AUC value of 0.61. The ROC is shown in figure 7.
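The reported recall, precision and accuracy can be cross-checked against the class sizes of the test set (36 malignant and 37 benign tumors). The following calculation reads the three values as fractions and assumes that malignant is the positive class; the resulting confusion-matrix counts are inferred from the reported figures and are not taken from the original results.

```python
MALIGNANT, BENIGN = 36, 37          # class sizes of the test set
RECALL, PRECISION = 0.75, 0.5625    # reported values, read as fractions

true_pos = round(RECALL * MALIGNANT)              # malignant tumors found
false_neg = MALIGNANT - true_pos                  # malignant tumors missed
false_pos = round(true_pos / PRECISION) - true_pos  # benign rated malignant
true_neg = BENIGN - false_pos                     # benign rated benign
accuracy = (true_pos + true_neg) / (MALIGNANT + BENIGN)

print(true_pos, false_neg, false_pos, true_neg, round(accuracy, 3))
```

Under these assumptions, the three reported values are mutually consistent.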

V. DISCUSSION AND FURTHER RESEARCH
In the following section, we discuss the obtained results and suggest, for each individual step, improvements which may further increase the accuracy of the presented system.

A. Segmentation
With the presented methodology, 83.56% of the test images were successfully segmented. Successful segmentation requires both the separation of lungs and lung wall and the separation of tumors from the lung wall, if they are interconnected. The biggest challenge of the segmentation was the separation of tumors attached to the lung wall. While the presented methodology was able to correctly segment all lungs with isolated tumors, 12 out of 30 of the lung wall tumors could only be separated by manually placed markers. The segmentation step therefore shows above all that additional user input can improve the effectiveness of the system. For our approach, successful automatic segmentation requires that the connection region between tumor and lung wall does not exceed a maximum width. This width must be less than the diameter of the tumor, since a separation by erosion is otherwise impossible. In this work, this problem was solved interactively, with the user placing markers at critical junctions. A desirable approach would find these markers without user input, so that the segmentation could be carried out fully automatically for all special cases as well. This could be realized by a form of edge tracking which marks the affected image area when the gradient direction changes strongly.

B. Tumor detection
Using texture features, up to 80.82% of the annotated tumors could be successfully detected and localized, provided that all areas of the heat map were considered for detection. However, this strategy also falsely identifies tumors in many areas of the image. If only the area with the highest SVM score was considered per image, 88.19% of these false positives could be eliminated. However, this strategy reduces the recall to 60.27% (44 of 73 tumors). It has thus been shown that an improvement of the precision results in a reduction of the recall and vice versa. Future work could build on these results to find methods that eliminate a larger number of false positives without reducing the recall.

C. Tumor classification
Compared to the other work of the SPIE-AAPM Lung CT Challenge, the proposed methodology has achieved above-average results. The SVM was trained using only the provided calibration set consisting of 10 images. This shows that the presented methodology of classifying texture features by SVM is able to achieve good results even on small training sets. Training with a larger dataset could potentially further improve the classification results.
Currently, only the layer containing the tumor center is used to evaluate the tumor based on its textural features. Fang Han et al. [5] use three-dimensional Haralick features to classify tumors. In their approach, surrounding tissue layers are also considered for the evaluation of the tumor. On their dataset, three-dimensional Haralick features reached an AUC value of 0.9441, compared to 0.9373 for two-dimensional Haralick features. In future work, three-dimensional Haralick features could be utilized to possibly further improve the accuracy of the classification.

VI. CONCLUSION
The results of this work have shown that the presented methodologies can be successfully used to implement a complete system for automatic tumor diagnosis. We obtained and presented very encouraging results. Texture features can still be considered a strong tool for image classification, even in complex applications like tumor recognition and classification. Furthermore, our SVM classifier has proven to be very effective in combination with Haralick features, achieving better results than several other classifiers on the same data set.

Fig. 1: Illustration of our complete workflow including segmentation, heatmap calculation and classification of the detected tumor. The red marker is the computed indicator for a malignant nodule.

Fig. 2: Removal of the image background: The input image (a) is used to create a binary image (b). Using connected-component labeling, components are determined (c). The biggest component is removed (d) and the image is eroded (e). By masking the input with the eroded image we obtain the lung corpus (f).

Fig. 3: Removal of the thorax: The input image (a) is used to create a binary image (b). The binary image is now eroded to separate tumors that have grown together with the lung wall (c). By means of connected-component labeling, all isolated components are determined (d) and combined with the previously calculated binary image (e). After a marker-based watershed transformation (f) the biggest component is removed (g) followed by the second biggest component (thorax). The resulting image only contains the isolated lung tissue (h).

Fig. 4: Guided segmentation using markers provided by the user. The tumor in the input image (a) could not be automatically segmented. However, if markers are provided by the user (c), the segmentation yields a correct result.

Fig. 7: Calculated ROC curve describing our classification results.

Of all 11 methods submitted to the challenge, only a total of 2 achieved an AUC value above 0.61. Compared to the submissions that only used the calibration data set consisting of 10 patients for training, the presented work scores second best. Other submissions used the National Lung Screening Trial (NLST) dataset with 53,454 lung scans of former smokers or the Lung Image Database Consortium (LIDC) dataset with lung CT scans of 1,010 different patients. The submission with the highest achieved AUC value used an unspecified in-house dataset. Most submissions use some form of thresholding or region growing for segmentation. The watershed transformation used in this work is not used in any of the submitted papers. In addition to our approach, three submissions of the challenge use an SVM as a classifier. Of all submissions that use an SVM classifier, two achieved a lower AUC value than the presented work; one achieved the same AUC value. One submission uses a convolutional neural network (CNN) trained on the LIDC dataset as a classifier. However, this work only achieved an AUC of 0.59. The best work achieved an AUC of 0.68 using a support vector regressor for classification.

TABLE I: Evaluation of tumor detection

TABLE II: Evaluation of p-values for feature selection