This repository contains scripts supporting my research on Chain Event Graphs, conducted under the Undergraduate Research Support Scheme at the University of Warwick in summer 2021. The final report detailing my work is available at: link.
In September 2022, I ran workshops during the CEG Conference at the University of Warwick. A self-contained Jupyter notebook from the workshops is available on Binder.
Chain Event Graphs (CEGs) are a family of graphical statistical models derived from well-known probability trees. They form a generalisation of Bayesian Networks, providing an explicit representation of context-specific conditional dependencies within their topology. This report demonstrates on a real cohort study how CEGs enable us to depict various hypotheses about the data generation mechanisms. We argue that CEGs, in contrast to the standard framework of generalised linear models, can exhibit dependencies between multiple variables in a much more intuitive way. We also present how CEGs can be used for statistical inference with an incomplete data set, identifying if the data are missing at random and extracting further conclusions from the patterns of missingness. We additionally discuss the problem of data discretisation and propose a method for supervised discretisation without leaving the framework of tree-based models.
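As a rough illustration of what "context-specific" means here, the toy sketch below groups the situations of a small event tree into stages, i.e. sets of situations that share the same transition probabilities. Everything in it is an assumption made for illustration: the variable names, the counts, the `group_into_stages` helper and the crude tolerance rule are hypothetical and are not the stage-partitioning algorithms used in this repository.

```python
# Toy sketch (hypothetical data and helper names, not the repository's code).
# Each key is a situation in the event tree: the path of earlier outcomes.
# Each value holds observed counts for the next variable.
counts = {
    ("smoker", "active"): {"ill": 30, "healthy": 70},
    ("smoker", "inactive"): {"ill": 60, "healthy": 40},
    ("non-smoker", "active"): {"ill": 15, "healthy": 35},
    ("non-smoker", "inactive"): {"ill": 15, "healthy": 35},
}


def transition_probs(outcome_counts):
    """Empirical conditional distribution of the next variable."""
    total = sum(outcome_counts.values())
    return {outcome: n / total for outcome, n in outcome_counts.items()}


def group_into_stages(counts, tol=0.05):
    """Put situations with (nearly) equal transition probabilities together.

    A crude tolerance check stands in for a proper statistical score.
    """
    stages = []  # list of (representative probabilities, member situations)
    for situation, outcome_counts in counts.items():
        probs = transition_probs(outcome_counts)
        for representative, members in stages:
            if all(abs(probs.get(k, 0.0) - p) <= tol
                   for k, p in representative.items()):
                members.append(situation)
                break
        else:
            stages.append((probs, [situation]))
    return [members for _, members in stages]


stages = group_into_stages(counts)
for index, members in enumerate(stages):
    print(f"stage {index}: {members}")
```

In this made-up example, activity level is irrelevant for non-smokers but matters for smokers. That dependence holds only in some contexts, so it is captured by the staging rather than by a single conditional independence statement over the whole variables.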
├── chain-event-graphs
│ ├── R
│ ├── python
│ ├── chain-event-graphs-report
│ ├── README.md
│ └── data
- R: contains the R scripts for data processing, EDA, and the stagedtrees submodule with custom algorithms for stage partitioning
- python: contains the Python scripts used for fitting the CEGs and generating the final graphs for the report
- chain-event-graphs-report: submodule with the final .tex report
- ceg-workshops: submodule with the workshops for the CEG Conference, University of Warwick, September 2022
- data: contains raw data (content git ignored, data are safeguarded)