This package facilitates the identification of counterfactual queries in structural causal models via the ID* and IDC* algorithms by Shpitser, I. and Pearl, J. (2007, 2008) https://arxiv.org/abs/1206.5294, https://jmlr.org/papers/v9/shpitser08a.html. A simple interface is provided for defining causal graphs and counterfactual conjunctions. Construction of parallel worlds graphs and counterfactual graphs is done automatically based on the counterfactual query and the causal graph.
For further information, see the tutorial paper on this package published in The R Journal: https://doi.org/10.32614/RJ-2023-053
You can install the latest development version by using the devtools package:
# install.packages("devtools")
devtools::install_github("santikka/cfid")
Directed acyclic graphs (DAG) can be defined using the function dag
in
a syntax similar to the
dagitty
package. This
function accepts edges of the form X -> Y
, X <- Y
, and X <-> Y
,
where the last variant is a shorthand for a latent confounder affecting
both X
and Y
(a so-called bidirected edge). Subgraphs can be defined
using curly braces {...}
. Edges to and from subgraphs connect to all
vertices present in the subgraph. Subgraphs can also be nested. Some
examples of valid constructs include:
dag("X -> Y <- Z <-> W")
dag("{X Y Z} -> {A B}")
dag("X -> {Z <-> {Y W}}")
which define the following DAGs:
flowchart LR;
X((X))-->Y((Y));
Z((Z))-->Y;
W((W))<-.->Z;
flowchart LR;
X((X))-->A((A));
Y((Y))-->A;
Z((Z))-->A;
X-->B((B));
Y-->B;
Z-->B;
flowchart LR;
X((X))-->Z((Z));
X-->Y((Y));
X-->W((W));
Z<-.->Y;
Z<-.->W;
A counterfactual variable is defined by its name, value, and the
submodel that it originated from (a set of interventions). For example,
The function counterfactual_variable
and its shorthand alias cf
can
be used to construct counterfactual variables. This function takes three
arguments: var
, obs
, and sub
that correspond to the variable name,
observed value assignment and subscript (the submodel). For example,
cf(var = "Y", obs = 0, sub = c(X = 0))
#> y_{x}
by default, the value 0 is the “default” or baseline level, and integer
values different from 0 are denoted by primes. For example obs
argument:
cf(var = "Y", obs = 1, sub = c(X = 0))
#> y'_{x}
Purely observational counterfactual variables (of the original causal
model) can be defined by omitting the sub
argument.
Conjunctions of multiple counterfactual variables can be constructed
using the function counterfactual_conjunction
or its shorthand alias
conj
. This function simply takes an arbitrary number of
"counterfacual_variable"
objects as its argument. For example, the
counterfactual conjunction
v1 <- cf("Y", 0)
v2 <- cf("Y", 1, c("X" = 0))
conj(v1, v2)
#> y /\ y'_{x}
Identifiability of (conditional) counterfactual conjunctions can be
determined via the function identifiable
. This function takes the
conjunction gamma
to be identified from the set of all interventional
distributions "dag"
object g
. An optional conditioning conjunction delta
can also be
provided. The solution is provided in LaTeX syntax if the query is
identifiable. For instance, we can consider the identifiability of
flowchart TB;
X((X))-->W((W));
W-->Y((Y));
D((D))-->Z((Z));
Z-->Y;
X<-.->Y;
g1 <- dag("X -> W -> Y <- Z <- D X <-> Y")
v1 <- cf("Y", 0, c(X = 0))
v2 <- cf("X", 1)
v3 <- cf("Z", 0, c(D = 0))
v4 <- cf("D", 0)
c1 <- conj(v1)
c2 <- conj(v2, v3, v4)
identifiable(g = g1, gamma = c1, delta = c2)
#> The query P(y_{x}|x' /\ z_{d} /\ d) is identifiable from P_*.
#> Formula: \frac{\sum_{w} P_{x}(w)P_{w,z}(y,x')}{P(x')}
For more information and examples, please see the package documentation.
- The
causaleffect
package provides the ID and IDC algorithms for the identification of causal effects (among other algorithms). - The
dosearch
package provides a heuristic search algorithm that uses do-calculus to identify causal effects from an arbitrary combination of input distributions. - The
dagitty
package provides various tools for causal modeling, such as finding adjustment sets and instrumental variables. - The
R6causal
package implements an R6 class for structural causal models, and provides tools to simulate counterfactual scenarios for discrete variables.