Paper ID: 2305.02640

Towards Causal Representation Learning and Deconfounding from Indefinite Data

Hang Chen, Xinyu Yang, Qing Yang

Owing to the cross-pollination between causal discovery and deep learning, non-statistical data (e.g., images, text, etc.) conflicts with traditional causal data in both its properties and the methods applicable to it. To unify these data types of varying forms, we redefine causal data from two novel perspectives and then propose three data paradigms. Among them, indefinite data (such as dialogues or video sources) suffers from low sample utilization and the inapplicability of distributional assumptions, both of which leave causal representation learning from indefinite data, as yet, largely unexplored. We design a causal strength variational model to address these two problems. Specifically, we leverage causal strength, instead of independent noise, as the latent variable to construct the evidence lower bound. Under this design, the causal strengths of different structures are regarded as a distribution and can be expressed as a 2D matrix. Moreover, to account for latent confounders, we disentangle the causal graph G into two relation subgraphs O and C, where O contains the pure relations between observed variables and C represents the relations from latent variables to observed variables. We implement these designs as a dynamic variational inference model tailored to learning causal representations from indefinite data under latent confounding. Finally, we conduct comprehensive experiments on synthetic and real-world data to demonstrate the effectiveness of our method.
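
The abstract's core idea is to treat a distribution over causal strengths (representable as a 2D matrix) as the latent variable of a variational model, and to split the causal graph G into an observed-to-observed subgraph O and a confounder-to-observed subgraph C. The following is a minimal PyTorch-style sketch of that idea only; the module layout, dimensions, standard-normal prior, and Gaussian reconstruction term are assumptions made for illustration, not the paper's actual architecture.

```python
# Illustrative sketch only: a toy variational model in which a causal-strength
# matrix (rather than independent noise) serves as the latent variable, and the
# latent is split into O (observed -> observed) and C (latent confounder ->
# observed) blocks. Names, dimensions, and priors are assumptions for this example.
import torch
import torch.nn as nn


class CausalStrengthVAE(nn.Module):
    def __init__(self, n_obs: int, n_conf: int, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.n_obs, self.n_conf = n_obs, n_conf
        n_edges = n_obs * n_obs + n_conf * n_obs  # entries of O plus entries of C
        self.encoder = nn.Sequential(
            nn.Linear(n_obs * feat_dim, hidden), nn.ReLU(),
        )
        self.mu_head = nn.Linear(hidden, n_edges)      # mean of causal strengths
        self.logvar_head = nn.Linear(hidden, n_edges)  # log-variance of causal strengths
        self.decoder = nn.Sequential(
            nn.Linear(n_edges + n_obs * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_obs * feat_dim),
        )

    def forward(self, x):
        # x: (batch, n_obs, feat_dim) -- features of the observed variables
        h = self.encoder(x.flatten(1))
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        strength = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        # Split the flat strength vector into the two relation subgraphs.
        O = strength[:, : self.n_obs ** 2].view(-1, self.n_obs, self.n_obs)
        C = strength[:, self.n_obs ** 2:].view(-1, self.n_conf, self.n_obs)
        recon = self.decoder(torch.cat([strength, x.flatten(1)], dim=-1)).view_as(x)
        # ELBO = reconstruction term - KL(q(strength | x) || N(0, I));
        # a standard-normal prior on causal strengths is assumed here.
        recon_loss = ((recon - x) ** 2).mean()
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).mean()
        return recon_loss + kl, O, C


# Toy usage: 4 observed variables, 1 latent confounder, 8-dimensional features.
model = CausalStrengthVAE(n_obs=4, n_conf=1, feat_dim=8)
loss, O, C = model(torch.randn(2, 4, 8))
loss.backward()
```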

Submitted: May 4, 2023