% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/paca.R
\name{paca}
\alias{paca}
\title{Phenotype Aware Components Analysis (PACA)}
\usage{
paca(
X,
Y,
k = NULL,
scale = FALSE,
rank = 5,
thrsh = 10,
ccweights = FALSE,
info = 1
)
}
\arguments{
\item{X}{\eqn{m} by \eqn{n_1} matrix, where \eqn{m > n_1}; \cr
Case (foreground) input data matrix. \cr
Note: this input data needs to be scaled along the samples axis before being provided as input.
This preprocessing can be done using the \code{\link{transformCCAinput}} function.}
\item{Y}{\eqn{m} by \eqn{n_0} matrix, where \eqn{m > n_0}; \cr
Control (background) input data matrix. \cr
Note: this input data needs to be scaled along the samples axis before being provided as input.
This preprocessing can be done using the \code{\link{transformCCAinput}} function.}
\item{k}{positive integer, optional (default: \eqn{NULL}); \cr
Number of dimensions, \eqn{k}, of shared variation to be removed from the case data \code{X}. \cr
When \eqn{k = NULL} (default), \eqn{k} is inferred automatically, i.e., autoPACA is run by default.}
\item{scale}{bool, optional (default: \eqn{FALSE}); normalize (center+scale) each matrix column-wise}
\item{rank}{Positive integer, optional (default \eqn{5}); \cr
Number of dominant principal components to be computed for the corrected case data.}
\item{thrsh}{Positive real value, optional (default \eqn{10}); \cr
Threshold value for the maximum ratio between the variance of the PCs of the \emph{PACA} corrected \code{X} and the variance they explain in \code{Y},
which indicates the presence of residual shared variation in \code{X}.}
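\item{ccweights}{bool, optional (default \eqn{FALSE}); \cr
If \eqn{TRUE}, additionally return the CCA loadings (\code{A}, \code{B}) and canonical variables (\code{U}, \code{V}) along with the \emph{PACA} corrected case data (\code{Xtil}) and its principal components.}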
\item{info}{Integer, optional (default: 1); \cr
Verbosity level for the log generated. \cr
0: Errors and warnings only \cr
1: Basic informational messages \cr
2: More detailed informational messages \cr
3: Debug mode, all informational log is dumped}
}
\value{
By default, \code{paca} returns a list containing the following components:
\describe{
\item{Xtil}{ \eqn{m} by \eqn{n_1} matrix; \cr
the \emph{PACA} corrected case data, i.e., the data with the case-specific variation only.
}
\item{U0}{ \eqn{m} by \eqn{k} matrix; \cr
the \emph{PACA} shared components that are removed from \eqn{X}.
}
\item{x}{ \eqn{n_1} by \eqn{rank} matrix; \cr
the projections / scores of the \emph{PACA} corrected case data (\code{Xtil}).
}
\item{rotation}{ \eqn{m} by \eqn{rank} matrix; \cr
the rotation (eigenvectors) of the \emph{PACA} corrected case data (\code{Xtil}).
}
\item{k}{ the number of shared components removed, int
}
}
When \eqn{ccweights = TRUE}, \code{paca} returns a list containing the CCA directions and variates along with the \emph{PACA} principal components:
\describe{
\item{Xtil}{ \eqn{m} by \eqn{n_1} matrix; \cr
the \emph{PACA} corrected case data, i.e., the data with the case-specific variation only.
}
\item{U0}{ \eqn{m} by \eqn{k} matrix; \cr
the \emph{PACA} shared components that are removed from \eqn{X}.
}
\item{x}{ \eqn{n_1} by \eqn{rank} matrix; \cr
the projections / scores of the \emph{PACA} corrected case data (\code{Xtil}).
}
\item{rotation}{\eqn{m} by \eqn{rank} matrix; \cr
the rotation (eigenvectors) of the \emph{PACA} corrected case data (\code{Xtil}).
}
\item{k}{ the number of shared components removed, int
}
\item{A}{ the loadings for \eqn{X}
}
\item{B}{ the loadings for \eqn{Y}
}
\item{U}{ canonical variables of \eqn{X}, calculated by column centering \eqn{X} and projecting it on \eqn{A}
}
\item{V}{ canonical variables of \eqn{Y}, calculated by column centering \eqn{Y} and projecting it on \eqn{B}
}
}
}
\description{
Phenotype Aware Components Analysis (PACA) is a
contrastive learning approach leveraging canonical correlation analysis to robustly capture weak sources of
subphenotypic variation. Given case-control data of any modality, PACA highlights the dominant variation in a
subspace that is not affected by background variation as a putative representation of phenotypic heterogeneity. We do so by
removing the top \code{k} components of shared variation from the cases (or foreground) \code{X}.
In the context of complex disease, PACA learns a gradient of variation unique to cases \code{X} in
a given dataset, while leveraging control samples \code{Y} to account for variation and imbalances of biological
and technical confounders between cases and controls.
}
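\examples{
\dontrun{
# A minimal usage sketch based only on the signature and return values
# documented above; it is not taken from the package sources.
# Assumptions: the simulated data, its dimensions, and the choice k = 2 are
# purely illustrative. The required preprocessing (scaling along the samples
# axis, e.g. with transformCCAinput) is omitted because its signature is not
# shown on this page; real inputs should be preprocessed first.

set.seed(1)
m  <- 200  # number of features (rows); paca() expects m > n_1 and m > n_0
n1 <- 50   # number of case samples
n0 <- 40   # number of control samples

# Two hypothetical shared sources of variation present in cases and controls.
B  <- matrix(rnorm(m * 2), nrow = m, ncol = 2)
W1 <- matrix(rnorm(n1 * 2), nrow = n1, ncol = 2)
W0 <- matrix(rnorm(n0 * 2), nrow = n0, ncol = 2)

# tcrossprod(B, W) computes B times t(W), giving an m by n matrix.
X <- tcrossprod(B, W1) + matrix(rnorm(m * n1), nrow = m)  # cases (foreground)
Y <- tcrossprod(B, W0) + matrix(rnorm(m * n0), nrow = m)  # controls (background)

# Remove a fixed number of shared components.
res <- paca(X, Y, k = 2, rank = 5)
dim(res$Xtil)      # documented as m by n_1
dim(res$U0)        # documented as m by k
dim(res$x)         # documented as n_1 by rank
dim(res$rotation)  # documented as m by rank
res$k              # number of shared components removed

# Leave k = NULL so that k is inferred automatically (autoPACA).
res_auto <- paca(X, Y, rank = 5, thrsh = 10)
res_auto$k

# Request the CCA loadings and canonical variables as well.
res_cc <- paca(X, Y, k = 2, ccweights = TRUE)
names(res_cc)
}
}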