Inverse Propensity Weighting for evaluating employee assessments

Correlating assessment scores with performance-in-role (PIR) metrics provides a powerful form of validation evidence, but it is complicated by the absence of PIR metrics for applicants who were not hired. Traditional range restriction perspectives frame the problem as a lack of PIR metrics at low assessment scores, and the typical corrections make strong assumptions about how the relationship observed among incumbents extrapolates to applicants who were not hired (Bryant & Gokhale, 1972; Thorndike, 1947, 1949). When those extrapolation assumptions are strongly violated, the traditional corrections can substantially over- or under-estimate the true correlation. The problem is particularly acute when training machine learning models to predict PIR metrics, where overfitting to the observed data (incumbents with PIR measures) in a way that does not generalize to unobserved data (candidates without PIR measures) is a fundamental concern (Li et al., 2011; Strehl et al., 2010).
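For reference, a minimal sketch of the kind of extrapolation these corrections perform, using the standard textbook Thorndike Case 2 formula for direct range restriction (the notation below is illustrative and not taken from the paper):

```latex
% Thorndike Case 2 correction for direct range restriction on the assessment score.
% r : assessment-PIR correlation observed among incumbents (restricted sample)
% u : ratio of the applicant-pool SD to the incumbent SD of the assessment score
\hat{\rho} = \frac{r\,u}{\sqrt{1 - r^{2} + r^{2}u^{2}}}
```

This correction recovers the applicant-pool correlation only if selection depended solely on the assessment score and the incumbent regression of PIR on that score extends unchanged to unhired applicants; when selection also depends on other information, the extrapolation can miss in either direction.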
We propose using Inverse Propensity Weighting (IPW) as a simple and accurate method for obtaining correlation estimates that generalize to the candidate population (Lanza et al., 2013; Little, 1986; Rosenbaum, 1988; Rosenbaum & Rubin, 1983, 1984; Seaman & White, 2011; Thoemmes & Ong, 2016). A simulation study shows that when their case-specific assumptions are violated, traditional corrections systematically over- or under-estimate the true relationship, and additional data does not help. IPW-based estimators, by contrast, rest on weaker assumptions and, on the same simulated data, exhibit low bias that decreases as the sample size grows.
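The sketch below illustrates the general IPW recipe in this setting; the function, variable names, logistic propensity model, and clipping threshold are illustrative assumptions, not the authors' implementation. The idea is to model each applicant's probability of being hired (and hence of having an observed PIR metric) from pre-hire information, then reweight incumbents by the inverse of that probability when computing the assessment-PIR correlation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_correlation(X_prehire, assessment, pir, hired):
    """Estimate the applicant-population correlation between assessment
    scores and PIR by reweighting incumbents with inverse propensity of hire.

    X_prehire  : (n, k) pre-hire covariates observed for ALL applicants
    assessment : (n,) assessment scores for ALL applicants
    pir        : (n,) PIR metric; only meaningful where hired is True
    hired      : (n,) boolean, True if the applicant was hired (PIR observed)
    """
    # 1) Propensity model: P(hired | pre-hire information).
    features = np.column_stack([X_prehire, assessment])
    propensity_model = LogisticRegression(max_iter=1000)
    propensity_model.fit(features, hired)
    p_hire = propensity_model.predict_proba(features)[:, 1]

    # 2) Inverse-propensity weights for the hired cases, clipped to avoid
    #    exploding weights from very small estimated propensities.
    weights = 1.0 / np.clip(p_hire[hired], 0.01, None)

    # 3) Weighted Pearson correlation among incumbents, which estimates the
    #    assessment-PIR correlation in the full applicant population.
    x, y = assessment[hired], pir[hired]
    cov = np.cov(np.vstack([x, y]), aweights=weights)
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
```

Because the weights depend only on pre-hire information that is observed for every applicant, the key assumption is that selection into hiring is ignorable given those covariates, which is weaker than assuming the incumbent relationship extrapolates unchanged to applicants who were not hired.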