::
If you test 100 variables for
predictors with 95% confidence, you would expect 5 of those variables to appear
as predictors even if there is no relationship with the data. The problem, not
the solution, is that hundreds of variables are being tested for statistical
significance. Another data mining issue (intergenerational data mining) can
come about because a researcher uses another researcher’s data. A solution to
both problems involves testing the conclusions
on a new data set.