To the best of our knowledge, this work introduces the first framework for clustering longitudinal data by leveraging time-dependent causal representation learning. Clustering longitudinal data has gained significant attention across various fields, yet traditional methods often overlook the causal structures underlying observed patterns. Understanding how covariates influence outcomes is critical for policymakers and business leaders seeking actionable and interpretable insights. Although causal discovery models have advanced from static to time-series frameworks, their application to longitudinal data remains underexplored. To address this limitation, we propose CLOUD-CG (Clustering on Longitudinal Causal Graphs), a method that clusters longitudinal data according to the causal mechanisms represented by Temporal Directed Acyclic Graphs (T-DAGs). CLOUD-CG preserves unit-level heterogeneity, enabling the identification of groups with similar causal structures and delivering interpretable insights. We validate the framework through extensive simulations and demonstrate its practical utility by applying it to deposit data from commercial banks in Mexico. This application reveals how macroeconomic variables causally influence deposits, providing policymakers with a robust tool to monitor and enhance financial stability in emerging markets. Our work contributes to the growing field of clustering and causal discovery within longitudinal data analysis, offering new possibilities for understanding complex, time-dependent relationships across various domains.