We propose a new approach to falsify causal discovery algorithms without ground truth, based on testing the causal model on a variable pair that was excluded while learning the model. Specifically, given data on X, Y, Z, where Z = (Z1, ..., Zk), we apply the causal discovery algorithm separately to the 'leave-one-out' data sets {X, Z} and {Y, Z}. We demonstrate that the two resulting causal models, in the form of causal graphs such as Acyclic Directed Mixed Graphs (ADMGs), often entail conclusions about the dependencies between X and Y and make it possible to estimate E(Y | X = x) without any joint observations of X and Y, given only the leave-one-out data sets. We call this estimation 'Leave-One-Variable-Out (LOVO)' prediction. Its error can be estimated because the joint distribution P(X, Y) is in fact available; X and Y were omitted only for the purpose of falsification.
We present two variants of LOVO prediction: a graphical method applicable to general causal discovery algorithms, and a version tailored to algorithms that rely on specific a priori assumptions, such as linear additive noise models. Simulations indicate that the LOVO prediction error is indeed correlated with the accuracy of the causal outputs, supporting the method's effectiveness.
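To make the idea concrete, the following is a minimal sketch of the LOVO cross-validation loop in a simple linear setting, in the spirit of the linear-additive-noise variant. The names lovo_predict and lovo_error are illustrative rather than the paper's API, and the combination rule used here (chaining a regression of Z on X with a regression of Y on Z) is only one crude way to merge the two leave-one-out models; the graphical variant would instead derive the predictor from the inferred ADMGs.

    # Illustrative LOVO sketch: function names and the chaining rule below
    # are assumptions for this example, not the paper's implementation.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def lovo_predict(x_new, X, Z_x, Y, Z_y):
        """Predict E(Y | X = x) without any joint observations of X and Y.

        X, Z_x come from the leave-one-out data set {X, Z};
        Y, Z_y come from the leave-one-out data set {Y, Z}.
        """
        # Model learned on {X, Z}: how Z behaves given X.
        z_given_x = LinearRegression().fit(X.reshape(-1, 1), Z_x)
        # Model learned on {Y, Z}: how Y behaves given Z.
        y_given_z = LinearRegression().fit(Z_y, Y)
        # Chain the two models: x -> E[Z | x] -> E[Y | Z] (a crude merge
        # that ignores the confounding corrections a graphical LOVO
        # predictor would apply).
        z_hat = z_given_x.predict(x_new.reshape(-1, 1))
        return y_given_z.predict(z_hat)

    def lovo_error(X, Y, y_pred):
        """Relative error of the LOVO prediction. It is computable because
        the joint distribution P(X, Y) is available; it was hidden only
        from the causal discovery step, not from the evaluation."""
        baseline = np.mean((Y - Y.mean()) ** 2)
        return np.mean((Y - y_pred) ** 2) / baseline

    # Toy data (hypothetical): Z is a common cause of X and Y.
    rng = np.random.default_rng(0)
    n = 2000
    Z = rng.normal(size=(n, 3))
    X = Z @ np.array([1.0, -0.5, 0.3]) + 0.5 * rng.normal(size=n)
    Y = Z @ np.array([0.8, 0.2, -0.7]) + 0.5 * rng.normal(size=n)

    # Split the sample so the learning step never sees X and Y jointly:
    # the first half provides the {X, Z} view, the second half the {Y, Z}
    # view. The full sample is used only to evaluate the prediction error.
    half = n // 2
    y_pred = lovo_predict(X, X[:half], Z[:half], Y[half:], Z[half:])
    print("relative LOVO error:", lovo_error(X, Y, y_pred))

In this sketch a small relative error indicates that the two leave-one-out models combine into a consistent account of the X-Y dependence, which is the signal LOVO cross-validation uses to assess the discovery algorithm.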
Cross-validating causal discovery via Leave-One-Variable-Out
2025