covariance-constrained Gaussian Maximum Likelihood Estimation via semidefinite approximations (SP-GMLE)
James Saunderson James.Saunderson@monash.edu
Chi Pham Chi.Pham@monash.edu
This is a R parser for CVXR to solve the Gaussian MLE problem with added constraints.
- Version 1.0 Initial release
This parser is use to solve the MLE problem for p-dimensional Gaussian models with convex constraints on the covariance matrix. Which is a generalization of the classical linear covariance models.
The problem was proven to be convex if there are enough data by Zwiernik, Uhler, and Richards. This parser produce a epigraph of the negative-log likelihood function by using Gaussian quadrature, a weighted-sum of positive semi-definite epigraphs.
To use this parser, you need an installation of R as well as CVXR. Other than that, MOSEK and Rmosek is highly recommended if you want to add any constraints. Download SPGL.R and put it within the same folder as your other code, then use source("SPGL.R")
.
rm(list = ls())
suppressMessages(require("Rmosek"))
suppressMessages(library("CVXR"))
suppressMessages(source("SPGL.R"))
msk = MOSEK()
import_solver(msk)
The parser take in 5 parameters:
- d: problem dimension
- Sigma: dxd CVXR variable
- Tau: CVXR scalar variable
- Sn: sample covariance of the data
- n: Gaussian quadrature degree
#randomly generated data
data = rnorm(4*20) #20 datapoints
data = matrix(data,nrow = 20)
Sn = cov(data,data)*(dim(data)[1]-1)/(dim(data)[1])
#rescaled from unbiased estimator
n = 5
d = dim(data)[2]
Sigma <- Variable(d,d)
Tau <- Variable(1)
#parser generate epigraph constraints
rep <- represent(Sigma,Tau,n,d,Sn)
constraints <- rep$constraints
#normal CVXR syntax
objective <- Minimize(Tau)
problem <- Problem(objective, constraints)
result <- solve(problem, solver = "MOSEK")
print(result$getValue(Sigma))
print(result$getValue(Tau))
Any CVXR variable define outside the parser (Sigma and Tau in this case) can be access by using result$getValue(variableName)
To add constraint on the covariance matrix, the CVXR variable Sigma can be conditioned directly. Below is an example of a correlation matrix constraint (diagonal entry equal 1)
constraints <- c(constraints,list(
diag(Sigma) == rep(1,d)
))
There are limitations on which way the constraints can be add, most of the time you will need to use the inbuilt CVXR functions. Below is an example of a more complicated constraint, a brownian motion model.
#Tree description matrix
E <- matrix(
c(1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1,
1, 1, 1, 1
),
ncol = d,
byrow = TRUE )
#Extra variables
r = dim(E)[1] # Number of nodes
v <- Variable(r, nonneg=TRUE) #Branch length
#Brownian motion tree constraint
for (i in seq(from = 1, to = d,by = 1)){
for (j in seq(from = 1, to = d,by = 1)){
ancestor = rep(0,r)
#Boolean, ancestor[k] = 1 if k is an common ancestor of i and j
for (k in seq(from = 1, to = r,by = 1)){
EtE = E[k,] %*% t(E[k,])
if(EtE[i,j] == 1){
ancestor[k] = 1
}
}
#dot product a.v
temp = sum_entries(ancestor*v)
constraints <- c(constraints,list(
X[i,j] == temp
))
}
}
print(result$getValue(v))
Correlation matrix constraint on 4-d problem with 20 randomly-generated data points
rm(list = ls())
suppressMessages(require("Rmosek"))
suppressMessages(library("CVXR"))
suppressMessages(source("SPGL.R"))
msk = MOSEK()
import_solver(msk)
data = rnorm(4*20)
data = matrix(data,nrow = 20)
Sn = cov(data,data)*(dim(data)[1]-1)/(dim(data)[1])
n = 3
d = dim(data)[2]
Sigma <- Variable(d,d)
Tau <- Variable(1)
rep <- represent(Sigma,Tau,n,d,Sn)
constraints <- rep$constraints
constraints <- c(constraints,list(
diag(Sigma) == rep(1,d)
))
objective <- Minimize(Tau)
problem <- Problem(objective, constraints)
result <- solve(problem, solver = "MOSEK")
print(result$getValue(Sigma))
print(result$getValue(Tau))
In cases where you want to step deep into the parser itself, there are options to access the underlying variables, the ids of these underlying variable is stored in rep$variableid
. For example to access variable "x", result[[ids[["x"]]]]
.