Before diving into methods of feature engineering and feature selection, we must first have a solid understanding of the predictive modeling process. A thorough understanding of this process helps the analyst make a number of critical choices, most of which occur before any models are built. Moreover, the more the analyst knows about the data, the mechanism that generated the data, and the question(s) to be answered, the better the initial choices will be. These initial choices include the model performance metric, the approach to resampling, and the model(s) and corresponding tuning parameter(s). Without good choices, the overall modeling process often takes more time and leads to the selection of a sub-optimal model.
The next step, before any modeling, is to begin to understand the characteristics of the available data. One of the best ways to do this is to visualize the data, which is the focus of the next chapter.