12.5 Summary

Global search methods can be an effective tool for investigating the predictor space and identifying subsets of predictors that are optimally related to the response. Although these methods have some tunable parameters (e.g., the number of iterations, initial subset size, and mutation rate), the default values are sensible and tend to work well.

Although the global search approaches are usually effective at finding optimal feature sets, they are computationally taxing. For example, the complete process of conducting each of the simulated annealing analyses consisted of fitting a total of 5,501 individual models, and the genetic algorithm analyses required 8,251 models. The independent nature of the folds in external cross-validation allows these methods to be run in parallel, thus reducing the overall computation time. Even so, the GA described above took more than 9 hours with parallel processing, running each external cross-validation fold on a separate core.
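
Because the external folds do not depend on one another, each fold's search can be handed to its own worker. The following is a minimal sketch of that idea and not the code used for the analyses above. The function `search_one_fold` is a hypothetical stand-in that only mimics a search by drawing random scores, and the fold count of 10 is an assumption.

```python
import random
from concurrent.futures import Executor, ThreadPoolExecutor


def search_one_fold(fold_id: int, n_iter: int = 200) -> tuple[int, float]:
    """Hypothetical stand-in for a full feature search on one external fold.

    A real search would fit a model at every iteration; here a random
    score is drawn so that the example runs instantly.
    """
    rng = random.Random(fold_id)  # per-fold seed keeps folds reproducible
    best = 0.0
    for _ in range(n_iter):
        best = max(best, rng.random())
    return fold_id, best


def run_external_folds(n_folds: int, executor: Executor) -> dict[int, float]:
    """Submit every fold as an independent task and collect the results."""
    futures = [executor.submit(search_one_fold, fold) for fold in range(n_folds)]
    return dict(future.result() for future in futures)


# Threads are used here for portability. For CPU-bound model fitting,
# a ProcessPoolExecutor would spread the folds across separate cores.
with ThreadPoolExecutor(max_workers=4) as pool:
    fold_results = run_external_folds(10, pool)
```

With one worker per fold, the wall-clock time is bounded below by the slowest single fold, which is why a long search remains slow even when fully parallelized.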

In addition, it is important to point out that the naive Bayes model used to compare methods did not optimize any tuning parameters⁹³. If the global search methods were used with a radial basis support vector machine (2 tuning parameters) or a C5.0 model (3 tuning parameters), then the computational requirements would rise significantly.
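
A rough calculation illustrates how quickly the cost could grow. The sketch below assumes, purely for illustration, that every model fit in the search is replaced by a grid search with 5 candidate values per tuning parameter. The grid size is hypothetical, and any resampling used inside the tuning step is ignored, so the results should be read as lower bounds.

```python
def fits_with_tuning(base_fits: int, values_per_param: int, n_params: int) -> int:
    """Model fits needed if each original fit becomes a full grid search."""
    grid_size = values_per_param ** n_params
    return base_fits * grid_size


# Model counts reported above for the untuned naive Bayes analyses.
SA_FITS = 5501
GA_FITS = 8251

# Assumed grid: 5 values per parameter (hypothetical).
sa_svm = fits_with_tuning(SA_FITS, 5, 2)  # radial basis SVM, 2 parameters
ga_svm = fits_with_tuning(GA_FITS, 5, 2)
sa_c50 = fits_with_tuning(SA_FITS, 5, 3)  # C5.0, 3 parameters
ga_c50 = fits_with_tuning(GA_FITS, 5, 3)

print(sa_svm, ga_svm)  # 137525 206275
print(sa_c50, ga_c50)  # 687625 1031375
```

Under these assumptions, the genetic algorithm paired with a three-parameter model would need more than a million fits.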

When combining a global search method with a model that has tuning parameters, we recommend that, when possible, the feature set first be winnowed down using expert knowledge about the problem. Next, it is important to identify a reasonable range of tuning parameter values. If a sufficient number of samples are available, a proportion of them can be split off and used to find a range of potentially good parameter values using all of the features. These tuning parameter values may not be ideal for every feature subset, but they should be reasonably effective for finding an optimal subset.
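
The split-off idea can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the data are simulated, a small k-nearest-neighbors classifier stands in for whatever tunable model is being used, and the 30% split, the candidate values of k, and the 0.02 accuracy tolerance are all hypothetical choices.

```python
import random
from collections import Counter


def split_off(samples, proportion, seed=0):
    """Shuffle a copy of the samples and return (split_off_part, remainder)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * proportion))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points, using all features."""
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, valid, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def promising_range(train, valid, candidates, tolerance=0.02):
    """Return the span of candidates scoring within `tolerance` of the best."""
    scores = {k: accuracy(train, valid, k) for k in candidates}
    best = max(scores.values())
    keep = [k for k in candidates if scores[k] >= best - tolerance]
    return min(keep), max(keep), scores


# Simulated two-class data with 5 features (hypothetical).
rng = random.Random(1)
data = []
for _ in range(300):
    label = rng.randint(0, 1)
    features = tuple(rng.gauss(label, 1.0) for _ in range(5))
    data.append((features, label))

# Split off 30% for tuning; the remainder is reserved for the feature search.
tuning_part, search_part = split_off(data, proportion=0.3)
tune_train, tune_valid = split_off(tuning_part, proportion=0.7, seed=1)
low, high, scores = promising_range(
    tune_train, tune_valid, candidates=[1, 3, 5, 9, 15, 25]
)
```

The feature search would then be run on `search_part` with the tuning parameter restricted to values between `low` and `high`, rather than re-tuning over a wide grid for every candidate subset.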

⁹³ The smoothness of the kernel density estimates could have been tuned, but this does not typically have a meaningful impact on performance.