Two Population Groups

What do we really find in our data?

So far, we have only differentiated the 400 individuals in our dataset by whether they are creditworthy (dark blue) or not (light blue).

Our dataset contains more information than that. It includes two population groups that differ in one key characteristic. In practice, this characteristic might be gender, ethnicity, or age (for example, older versus younger individuals). In our example, we classify individuals by their fictional place of origin: “Greenfield” and “Pinkville.”

Each group contains 100 creditworthy and 100 non-creditworthy individuals. The probability of being creditworthy is therefore identical in both groups: 50%.

Despite this, applying the credit score model leads to distinctly different score distributions for the two groups. This difference is evident in the graphics below.

Distribution of the two population groups
The bank decides to use your "optimal" decision threshold from the previous task for both population groups.
  1. List as many criticisms of this approach as you can.
  2. Argue, from the bank’s perspective, why this might be a reasonable approach.
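
One way to explore both tasks is to compare, for each group, what a single shared threshold does. The following is a minimal sketch, not the method used in this course: the function `group_rates`, the toy scores, and the threshold of 55 are all hypothetical and chosen only for illustration. They are not taken from the 400-person dataset. As in the dataset, each toy group has an equal number of creditworthy and non-creditworthy individuals.

```python
from dataclasses import dataclass


@dataclass
class GroupRates:
    approval_rate: float        # share of the group that gets a loan
    true_positive_rate: float   # share of creditworthy people approved
    false_positive_rate: float  # share of non-creditworthy people approved


def group_rates(scores, creditworthy, threshold):
    """Apply one threshold to one group and summarize the outcome."""
    approved = [s >= threshold for s in scores]
    n_pos = sum(creditworthy)
    n_neg = len(creditworthy) - n_pos
    tp = sum(a and c for a, c in zip(approved, creditworthy))
    fp = sum(a and not c for a, c in zip(approved, creditworthy))
    return GroupRates(sum(approved) / len(scores), tp / n_pos, fp / n_neg)


# Hypothetical toy data: same base rate (50%) in both groups,
# but Pinkville scores are shifted 10 points lower (an assumption).
labels = [False] * 4 + [True] * 4
greenfield_scores = [30, 40, 50, 60, 55, 65, 75, 85]
pinkville_scores = [20, 30, 40, 50, 45, 55, 65, 75]

threshold = 55  # hypothetical shared threshold
greenfield = group_rates(greenfield_scores, labels, threshold)
pinkville = group_rates(pinkville_scores, labels, threshold)

print("Greenfield:", greenfield)
print("Pinkville: ", pinkville)
```

In this toy setup, the shared threshold approves 62.5% of Greenfield but only 37.5% of Pinkville, and every creditworthy person in Greenfield is approved compared with 75% in Pinkville, even though both groups are equally creditworthy overall. Comparing such per-group rates on the real score distributions is one possible starting point for the discussion.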