Feature selection relies equally on logic and on trial and error. Logical selection is tried first, followed by trial and error.
Selecting features logically means using the approaches listed below to filter out unneeded features or to pick the most dominant ones.
- Correlation plot
- Checking for collinearity among variables
- Selecting variables based on business insight or common knowledge
- Building a linear model to check the coefficient values assigned to each variable
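As a rough illustration of the data-driven checks in this list, here is a minimal NumPy sketch. The data is synthetic and the names `x1`, `x2`, `x3` are made up for the example: `x2` is deliberately almost a copy of `x1`, and `x3` is pure noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (hypothetical): x2 is nearly a copy of x1, x3 is pure noise.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)

names = ["x1", "x2", "x3"]
X = np.column_stack([x1, x2, x3])

# 1. Correlation of each feature with the target.
target_corr = {name: np.corrcoef(X[:, i], y)[0, 1] for i, name in enumerate(names)}

# 2. Collinearity check: pairwise correlation among the features themselves.
feature_corr = np.corrcoef(X, rowvar=False)

# 3. Linear model on standardized features, so coefficients are comparable.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
A = np.column_stack([np.ones(n), Xs])  # add an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
coefficients = dict(zip(names, coef[1:]))

print("correlation with target:", target_corr)
print("feature correlation matrix:\n", feature_corr.round(2))
print("standardized coefficients:", coefficients)
```

In this setup `x3` should show a near-zero correlation and coefficient, which makes it a candidate for removal, while the high correlation between `x1` and `x2` suggests keeping only one of them. Note that `x2` also correlates strongly with the target, so a correlation plot alone would not reveal the redundancy.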
Once you have logically selected an initial set of features, you can use trial and error to combine, add, or remove features.
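One way to make the trial-and-error step systematic is to score each candidate subset on held-out data. The sketch below is only an illustration under assumed conditions: the data is synthetic, the feature names `a`, `b`, `c` are hypothetical, and `holdout_mse` is a helper written for this example, not a library function.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(1)
n = 400

# Synthetic data (hypothetical): only "a" and "b" drive the target.
features = {
    "a": rng.normal(size=n),
    "b": rng.normal(size=n),
    "c": rng.normal(size=n),
}
y = 2.0 * features["a"] - 1.5 * features["b"] + rng.normal(scale=0.5, size=n)

train, test = slice(0, 300), slice(300, n)


def holdout_mse(subset):
    """Fit least squares on the training rows and score on the held-out rows."""
    X = np.column_stack([np.ones(n)] + [features[name] for name in subset])
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    residuals = y[test] - X[test] @ coef
    return float(np.mean(residuals ** 2))


# Try every non-empty subset of the candidate features.
scores = {
    subset: holdout_mse(subset)
    for size in range(1, len(features) + 1)
    for subset in combinations(features, size)
}
best_subset = min(scores, key=scores.get)
print(best_subset, scores[best_subset])
```

Exhaustive search like this is only practical for a handful of features, since the number of subsets doubles with each one. With more features, adding or removing one at a time is the usual compromise.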
Combining can be beneficial when the target variable is binary. For example, being obese, having diabetes, and having irregular blood pressure can all be combined into a single feature to predict a disease.
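A minimal sketch of what combining could look like, assuming each condition is stored as a 0/1 indicator. The records are invented for illustration, and the two options shown are just common choices, not the only ways to combine features.

```python
import numpy as np

# Hypothetical patient records: 1 = condition present, 0 = absent.
obese = np.array([1, 0, 1, 0, 1])
diabetic = np.array([1, 0, 0, 0, 1])
irregular_bp = np.array([1, 1, 0, 0, 1])

# Option A: count how many risk factors each patient has.
risk_count = obese + diabetic + irregular_bp

# Option B: flag patients with at least one risk factor.
any_risk = (risk_count > 0).astype(int)

print(risk_count)  # [3 1 1 0 3]
print(any_risk)    # [1 1 1 0 1]
```

Whether the count, the flag, or the original three columns works best is itself something to settle by trial and error against the model's validation score.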