2.4 Extraction and Selection of Quantitative Features
After image preprocessing, 1409 quantitative imaging
features were extracted from each of the CMP and NP CT images using the
PyRadiomics v2.1.2 package, giving a total of 2818 features for CMP+NP.
These features fall into three groups. Group 1
(first order statistics) quantitatively delineates the distribution of
voxel intensities within the CT image through commonly used and basic
metrics. Group 2 (shape- and size-based features) reflects the shape and
size of the region. Group 3 (texture features) comprises textural
features calculated from grey level run-length and grey level
co-occurrence matrices, which quantify differences in region
heterogeneity.
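To make groups 1 and 3 concrete, the following is a minimal NumPy sketch of a few first-order statistics and of one co-occurrence-based texture measure (contrast) on a toy 2D array. It is an illustration only, not the PyRadiomics implementation used in this study; the function names, the single pixel offset, and the toy images are assumptions introduced here.

```python
import numpy as np


def first_order_stats(voxels):
    """Return a few first-order statistics of the voxel intensities."""
    v = np.asarray(voxels, dtype=float).ravel()
    mean = v.mean()
    std = v.std()
    # Skewness: third standardized moment (0 for a symmetric distribution).
    skew = 0.0 if std == 0 else float(((v - mean) ** 3).mean() / std ** 3)
    return {
        "mean": float(mean),
        "variance": float(v.var()),
        "skewness": skew,
        "minimum": float(v.min()),
        "maximum": float(v.max()),
    }


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized grey level co-occurrence matrix for one offset."""
    img = np.asarray(image, dtype=int)
    d_row, d_col = offset
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r, c], img[r2, c2]] += 1
    counts = counts + counts.T  # count each neighbour pair in both directions
    return counts / counts.sum()


def glcm_contrast(p):
    """Contrast: sum of (i - j)^2 * p(i, j); larger for heterogeneous regions."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())


# Toy regions (hypothetical data): a uniform patch and a checkerboard patch.
uniform = np.zeros((4, 4), dtype=int)
checker = np.indices((4, 4)).sum(axis=0) % 2

print(first_order_stats(checker))
print(glcm_contrast(glcm(uniform, levels=2)))
print(glcm_contrast(glcm(checker, levels=2)))
```

The uniform patch has no grey level transitions between horizontal neighbours, whereas in the checkerboard every horizontal neighbour pair differs, so the contrast value separates the two even though both are simple binary images.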
As described above, a large number of image features can be computed.
However, not all of the extracted features are useful for a particular
task. Therefore, dimensionality reduction and selection of task-specific
features are necessary steps for best performance. To reduce redundant
features, three feature selection methods were used: the variance
threshold (threshold = 0.8), SelectKBest, and the least absolute
shrinkage and selection operator (LASSO). For the variance
threshold method, the threshold was set to 0.8, so that features with a
variance smaller than 0.8 were removed. The SelectKBest method, a
univariate feature selection method, uses the p value to
analyze the relationship between each feature and the classification
result; all features with p < 0.05 were retained. For
the LASSO model, L1 regularization was used in the cost function, the
cross-validation parameter was set to 5, and the maximum number of
iterations was 1000.
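The three selection steps can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the exact code of this study: it assumes scikit-learn was the library used, that the steps were applied in sequence, that the univariate p values come from an ANOVA F-test (`f_classif`), that features were standardized before LASSO, and that the cross-validation value of 5 means five-fold cross-validation. The function `select_features` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold, f_classif
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler


def select_features(X, y, var_threshold=0.8, p_cutoff=0.05, cv=5, max_iter=1000):
    """Return the column indices that survive all three selection steps."""
    idx = np.arange(X.shape[1])

    # Step 1: drop features whose variance is below the threshold.
    vt = VarianceThreshold(threshold=var_threshold).fit(X)
    idx = idx[vt.get_support()]

    # Step 2: univariate test per feature; keep those with p < p_cutoff.
    # (Assumption: ANOVA F-test; the score function is not stated in the text.)
    _, p_values = f_classif(X[:, idx], y)
    idx = idx[p_values < p_cutoff]
    if idx.size == 0:
        return idx

    # Step 3: LASSO with cross-validated penalty; keep nonzero coefficients.
    # (Assumption: features are standardized before fitting.)
    X_scaled = StandardScaler().fit_transform(X[:, idx])
    lasso = LassoCV(cv=cv, max_iter=max_iter).fit(X_scaled, y)
    return idx[lasso.coef_ != 0]


# Synthetic example (hypothetical data): 80 cases, 20 features.
rng = np.random.default_rng(0)
y = np.tile([0, 1], 40)                  # alternating class labels
X = rng.normal(0.0, 2.0, size=(80, 20))  # noise features with variance near 4
X[:, 0] += 3.0 * y                       # feature 0 carries class information
X[:, 1] *= 0.1                           # feature 1 has variance far below 0.8

selected = select_features(X, y)
print(selected)
```

In this toy setting the low-variance feature is discarded in the first step and the informative feature is kept through all three steps; some noise features may also pass the p < 0.05 filter by chance, which is one reason a multivariate step such as LASSO follows the univariate one.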