
Data Mining vs. Sample Selection Bias

    • pcunniff
      Participant
        • CFA Level 1

        Hello!

        Does anyone know a good way to tell data mining bias apart from sample selection bias? They seem very similar. Your feedback would be much appreciated!

        Source: Kaplan

        Patrick

      • cfachris
        Participant
          • CFA Level 3

          Hey @pcunniff, sample selection bias arises when a subset of the population's data is EXCLUDED from the sample, so the sample is not truly random or representative of the population.

          Data mining bias, by contrast, comes from blindly searching a dataset for highly correlated patterns/relationships (“fitting”) without an economic rationale for them in the first place.

        • pcunniff
          Participant
            • CFA Level 1

            @cfachris so data mining is NOT about sampling. Is that the key difference? I hate when definitions are almost the same.

          • cfachris
            Participant
              • CFA Level 3

              Ugh, I know what you mean, it can be confusing.

              Yes, data mining is NOT sampling. Sample selection bias means exactly that: you CHOOSE the sample (by excluding a subset of the data in the population), so the results are biased and cannot be relied on.
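
              To make the sample selection point concrete, here is a minimal sketch. Everything in it is a made-up assumption for illustration (the 5% mean return, the 15% standard deviation, the -10% cutoff, and the function name `simulate_selection_bias`); it is not from Kaplan or the curriculum. It simply drops the worst performers from a simulated population and compares the means.

```python
import random
import statistics


def simulate_selection_bias(n_funds=10_000, cutoff=-0.10, seed=0):
    """Compare the mean return of a full population with a sample
    that silently excludes the poor performers."""
    rng = random.Random(seed)

    # Hypothetical population: annual fund returns, mean 5%, std dev 15%.
    population = [rng.gauss(0.05, 0.15) for _ in range(n_funds)]

    # Biased sample: funds returning less than `cutoff` are excluded,
    # e.g. because they closed and dropped out of the database.
    biased_sample = [r for r in population if r >= cutoff]

    return (
        statistics.mean(population),
        statistics.mean(biased_sample),
        len(biased_sample),
    )


pop_mean, biased_mean, kept = simulate_selection_bias()
print(f"population mean return: {pop_mean:.2%}")
print(f"biased sample mean:     {biased_mean:.2%} ({kept} funds kept)")
```

              The biased sample overstates the average return only because of which observations were left out. No pattern-hunting is involved, which is what separates it from data mining.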

              Data mining, on the other hand, uses a relevant dataset, but instead of testing a hypothesis you keep trying different mixes of variables until some correlate significantly with it, without an economic rationale behind them in the first place.

              To give an extreme example, suppose the weather in a local town somehow correlates highly with a person’s income. As a theory it makes no economic sense, but say the relationship happens to be statistically significant in the dataset by pure chance, so it gets included. Unless you’re a fisherman, in which case it might actually make sense…
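
              A rough sketch of how that can happen by chance alone, under stated assumptions: "income" and 200 candidate variables are all independent random noise, and 2.01 is used as an approximate 5% two-tailed critical t-value for 48 degrees of freedom. The names `pearson` and `mine_for_predictors` are hypothetical helpers written for this example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mine_for_predictors(n_obs=50, n_candidates=200, t_crit=2.01, seed=1):
    """Test many unrelated random variables against a random target and
    return those that look 'significant' purely by chance."""
    rng = random.Random(seed)

    # Hypothetical target, e.g. standardized income: pure noise.
    income = [rng.gauss(0, 1) for _ in range(n_obs)]

    hits = []
    for k in range(n_candidates):
        # Each candidate (weather, shoe size, ...) is independent noise.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson(candidate, income)
        # t-statistic for testing that the true correlation is zero.
        t = r * math.sqrt((n_obs - 2) / (1 - r * r))
        if abs(t) > t_crit:
            hits.append((k, r, t))
    return hits


hits = mine_for_predictors()
print(f"{len(hits)} of 200 unrelated variables look significant at about the 5% level")
```

              At a 5% significance level you would expect roughly 1 in 20 of these meaningless variables to pass the test. Reporting only the ones that passed, with no economic story behind them, is data mining bias.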
