Paper title
Interventions and Counterfactuals in Tractable Probabilistic Models: Limitations of Contemporary Transformations
Authors
Abstract
In recent years, there has been increasing interest in studying causality-related properties in machine learning models generally, and in generative models in particular. While this interest is well motivated, causal reasoning inherits the fundamental computational hardness of probabilistic inference, making exact reasoning intractable. Tractable probabilistic models have also recently emerged; they guarantee that conditional marginals can be computed in time linear in the size of the model, where the model is usually learned from data. Although initially limited to low tree-width models, recent tractable models such as sum-product networks (SPNs) and probabilistic sentential decision diagrams (PSDDs) exploit efficient function representations and also capture high tree-width models. In this paper, we ask the following technical question: can we use the distributions represented or learned by these models to perform causal queries, such as reasoning about interventions and counterfactuals? By appealing to some existing ideas on transforming such models to Bayesian networks, we answer mostly in the negative. We show that when SPNs are transformed to a causal graph, interventional reasoning reduces to computing marginal distributions; in other words, only trivial causal reasoning is possible. For PSDDs the situation is only slightly better. We first provide an algorithm for constructing a causal graph from a PSDD, which introduces augmented variables. Intervening on the original variables once again reduces to computing marginal distributions, but when intervening on the augmented variables, a deterministic but nonetheless causal semantics can be provided for PSDDs.
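The claim that interventions reduce to marginals can be illustrated with a minimal sketch. It assumes, purely for illustration, that the causal graph obtained from an SPN has latent variables as parents and the observed variables as childless leaves; the toy graph `Z -> X`, `Z -> Y`, the probability tables, and all function names below are hypothetical and are not taken from the paper. Under that assumption, graph surgery for `do(X = x)` removes the edge into `X`, and since `X` has no descendants the distribution of `Y` is left unchanged.

```python
# Hypothetical toy model (assumption, not from the paper):
# one latent variable Z with two observed children X and Y.
p_z = [0.3, 0.7]                           # P(Z = z)
p_x_given_z = [[0.9, 0.1], [0.2, 0.8]]     # p_x_given_z[z][x] = P(X = x | Z = z)
p_y_given_z = [[0.6, 0.4], [0.1, 0.9]]     # p_y_given_z[z][y] = P(Y = y | Z = z)


def marginal_y(y):
    """P(Y = y), obtained by summing out Z (X sums to one and drops out)."""
    return sum(p_z[z] * p_y_given_z[z][y] for z in range(2))


def conditional_y_given_x(y, x):
    """Observational query P(Y = y | X = x) via Bayes' rule."""
    num = sum(p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y] for z in range(2))
    den = sum(p_z[z] * p_x_given_z[z][x] for z in range(2))
    return num / den


def interventional_y_do_x(y, x):
    """P(Y = y | do(X = x)) by graph surgery.

    The factor P(X | Z) is replaced by an indicator that clamps X to x,
    which corresponds to deleting the edge Z -> X.
    """
    total = 0.0
    for z in range(2):
        for x_val in range(2):
            clamp = 1.0 if x_val == x else 0.0
            total += p_z[z] * clamp * p_y_given_z[z][y]
    return total


if __name__ == "__main__":
    for x in range(2):
        print("x =", x,
              "| P(Y=1) =", marginal_y(1),
              "| P(Y=1 | X=x) =", conditional_y_given_x(1, x),
              "| P(Y=1 | do(X=x)) =", interventional_y_do_x(1, x))
```

In this sketch the interventional query coincides with the plain marginal of `Y` for every value of `x`, while the observational conditional differs from it. This is the sense in which causal reasoning is trivial when the intervened variable has no descendants.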
