
Comparing changes

base repository: adamburkegh/spm_dim
base: v1.3.4
...
head repository: adamburkegh/spm_dim
compare: master
  • 9 commits
  • 13 files changed
  • 2 contributors

Commits on Apr 14, 2023

  1. Minor copyedit

    adamburkegh authored Apr 14, 2023

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
    720b72e
  2. Bump to 1.3.5

    adamburkegh committed Apr 14, 2023
    012287a

Commits on Jul 25, 2023

  1. 59a68d5
  2. 53f085f

Commits on Aug 10, 2023

  1. Fix XPU labelling

    adamburkegh committed Aug 10, 2023
    34f1217

Commits on Nov 29, 2024

  1. Dead code

    adamburkegh authored Nov 29, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    d0816d1
  2. Dead code

    adamburkegh authored Nov 29, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    6deb115

Commits on Dec 1, 2024

  1. Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    ad44f8a
  2. Typo

    adamburkegh authored Dec 1, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    707cd8c
Showing with 2,718 additions and 372 deletions.
  1. +2 −3 README.md
  2. +1 −1 build.gradle
  3. +257 −0 scripts/crosscmp.R
  4. +423 −0 scripts/dimchoice.R
  5. +254 −205 scripts/evalcor.R
  6. +483 −0 scripts/evalcor_cycle2.R
  7. +483 −0 scripts/evalcor_rn.R
  8. +35 −0 scripts/ica.R
  9. +181 −149 scripts/pca.R
  10. +567 −0 scripts/pca_cycle2.R
  11. +1 −1 scripts/sqd.sh
  12. +31 −0 scripts/tsne.R
  13. +0 −13 src/test/java/qut/pm/spm/measures/relevance/RelevanceTestUtils.java
5 changes: 2 additions & 3 deletions README.md
@@ -4,9 +4,7 @@ Source code and results investigating stochastic process quality dimensions in p

This also includes a genetic algorithm for mining stochastic process models, called the Stochastic Evolutionary Tree Miner (SETM).

-The paper describing this experiment is "Burke, A., Leemans, SJJ, Wynn, M.T, van der Aalst, W.M.D, and ter Hofstede, A.H.M. - Stochastic Process Model-Log Quality Dimensions: An Experimental Study, ICPM 2022".
-
-Further experiments with additional measures and analysis were performed in 2022-2023.
+The main paper describing these experiments is: Burke, Adam T., Sander J. J. Leemans, Moe T. Wynn, Wil M. P. van der Aalst, and Arthur H. M. ter Hofstede. 2024. “A Chance for Models to Show Their Quality: Stochastic Process Model-Log Dimensions.” Information Systems 124: 102382. [doi:10.1016/j.is.2024.102382](https://doi.org/10.1016/j.is.2024.102382).

# Development Setup and Installation

@@ -15,6 +13,7 @@ Further experiments with additional measures and analysis were performed in 2022
Checkout [`prom-helpers`](https://github.com/adamburkegh/prom-helpers) and [`prob-process-tree`](https://github.com/adamburkegh/prob-process-tree)

In `prob-process-tree`, `./gradlew test ; ./gradlew publishToMavenLocal`

In `prom-helpers`, `./gradlew test ; ./gradlew publishToMavenLocal`

In `spd_dim`, `./gradlew test`
2 changes: 1 addition & 1 deletion build.gradle
@@ -116,7 +116,7 @@ jar {


group = 'qut.pm'
-version = '1.3.4'
+version = '1.3.5'
description = 'sqdimensions'
sourceCompatibility = '11'
targetCompatibility = '11'
257 changes: 257 additions & 0 deletions scripts/crosscmp.R
@@ -0,0 +1,257 @@
library(dplyr)
library(tidyr)
library(corrplot)
library(factoextra)
library(stats)
library(rgl)
library(RColorBrewer)
# library(caret) # for findCorrelation()


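resetplots <- function(){
# close the active graphics device, if any; dev.off() errors when none is open
if (!is.null(dev.list())) dev.off()
par("mar")
par(mar=c(1,1,1,1))
}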

pcaheatmap <- function(pcv){
scale <- 'none'
# scale <- 'row' 'col'
col= colorRampPalette(brewer.pal(8, "Blues"))(25)

# margins=c(10,21)
heatmap(pcv[,1:nfactors], Colv = NA, Rowv = NA, col= col,
cexRow = 1.0, cexCol = 1.5 ,
scale = scale)
}

triplot <- function(ctpc, theta=30, phi=30, zoom = 0.9){
plot3d(ctpc$rotation,
# xlab="Adhesion",ylab="Simplicity",zlab="Entropy")
xlab="PCA1",ylab="PCA2",zlab="PCA3")

text3d(ctpc$rotation[,1:3],
texts=rownames(ctpc$rotation),
col="blue",
cex=0.8)


coords <- NULL
for (i in 1:nrow(ctpc$rotation)) {
coords <- rbind(coords,
rbind(c(0,0,0),
ctpc$rotation[i,1:3]))
}


lines3d(coords,
col="blue",
lwd=1)

view3d(theta = theta, phi = phi, zoom = zoom)
}

# Coordinates of the individuals
coord_func <- function(ind, loadings){
r <- loadings*ind
apply(r, 2, sum)
}


pcapred <- function(rd, ctpc){
scaledind <- scale(rd,
center = ctpc$center,
scale = ctpc$scale)
loadings <- ctpc$rotation
rdcoord <- apply(scaledind, 1, coord_func, loadings )
t(rdcoord)
}

minmaxscale <- function(x,buffer=1) {
range = buffer* (max(x)-min(x))
return((x- min(x)) /range)
}


hpath = "c:/Users/burkeat/bpm/"
# hpath = "c:/Users/Adam/bpm/"

workingPath = paste(hpath,"bpm-dimensions-lab/var/",sep="")
resultsPath = paste(hpath,"bpm-dimensions-lab/results/",sep="")


rdnl = read.csv( paste(workingPath,"expn_c3.csv",sep=""),
strip.white=TRUE)
# rdnl = read.csv( paste(workingPath,"expn_nm_c3.csv",sep=""),
# strip.white=TRUE)

rdnl2 = read.csv( paste(workingPath,"expn_c2.csv",sep=""),
strip.white=TRUE)

rdev = read.csv( paste(workingPath,"eval_c3.csv",sep=""),
strip.white=TRUE)

rdev2 = read.csv( paste(workingPath,"eval_c2.csv",sep=""),
strip.white=TRUE)

# ACTIVITY_RATIO_GOWER, # exclude - correlation with TRGx
# TRACE_RATIO_GOWER_2, # KEEP - TRG correlation group
# TRACE_RATIO_GOWER_3, # exclude - correlation with ARG,TRGx
# TRACE_RATIO_GOWER_4, # exclude - correlation with ARG,TRGx
# STRUCTURAL_SIMPLICITY_STOCHASTIC, # exclude - correlation with SSENC,SSEDC
# STRUCTURAL_SIMPLICITY_ENTITY_COUNT, # exclude - correlation with SSENC,SSS
# STRUCTURAL_SIMPLICITY_EDGE_COUNT, # KEEP - SIMP correlation group
# TRACE_GENERALIZATION_DIFF_UNIQ, # KEEP - TOR/EMT correlation group
# EARTH_MOVERS_TRACEWISE, # exclude - correlation with TOR,TGDU
# TRACE_OVERLAP_RATIO, # exclude - correlation with EMT,TGDU
# ENTROPY_PRECISION_TRACEWISE, # exclude - correlation with APU0
# ENTROPY_FITNESS_TRACEWISE, # exclude - correlation with TGF5
# ENTROPY_PRECISION_TRACEPROJECT, # exclude - correlation with APU0, HJFT
# TRACE_GENERALIZATION_FLOOR_5, # KEEP over correlated EMT - correlates better with EM in eval
# ENTROPY_FITNESS_TRACEPROJECT, # exclude - correlation with APU0, HJPT
# ENTROPIC_RELEVANCE_UNIFORM, # exclude - correlation with HRZ,HRR
# ENTROPIC_RELEVANCE_ZERO_ORDER, # KEEP - entropic relevance correlation group
# ENTROPIC_RELEVANCE_RESTRICTED_ZO, # exclude - correlation with HRU,HRZ
# MODEL_STRUCTURAL_STOCHASTIC_COMPLEXITY, # include?? Known relation
# ALPHA_PRECISION_UNRESTRICTED_ZERO # KEEP - entropy correlation group


# DR view
expc3dr <- rdnl %>% select(ALPHA_PRECISION_UNRESTRICTED_ZERO,
TRACE_GENERALIZATION_FLOOR_5,
TRACE_GENERALIZATION_DIFF_UNIQ,
ENTROPIC_RELEVANCE_ZERO_ORDER,
TRACE_RATIO_GOWER_2,
STRUCTURAL_SIMPLICITY_EDGE_COUNT
# MODEL_STRUCTURAL_STOCHASTIC_COMPLEXITY
)

expc2dr <- rdnl2 %>% select(# ALPHA_PRECISION_UNRESTRICTED_ZERO,
TRACE_GENERALIZATION_FLOOR_5,
ENTROPY_FITNESS_TRACEPROJECT,
ENTROPY_PRECISION_TRACEPROJECT,
# ENTROPIC_RELEVANCE_ZERO_ORDER,
ACTIVITY_RATIO_GOWER,
TRACE_RATIO_GOWER_2,
STRUCTURAL_SIMPLICITY_EDGE_COUNT )

# DR view
eval3dr <- rdev %>% select(ALPHA_PRECISION_UNRESTRICTED_ZERO,
TRACE_GENERALIZATION_FLOOR_5,
TRACE_GENERALIZATION_DIFF_UNIQ,
ENTROPIC_RELEVANCE_ZERO_ORDER,
TRACE_RATIO_GOWER_2,
STRUCTURAL_SIMPLICITY_EDGE_COUNT,
# MODEL_STRUCTURAL_STOCHASTIC_COMPLEXITY,
EARTH_MOVERS
# ,ENTROPY_PRECISION, # excluded for vagueness + expense - better proxies for PCA dim
# ENTROPY_RECALL # excluded for vagueness + expense - better proxies for PCA dim
)




eval2dr <- rdev2 %>% select(# ALPHA_PRECISION_UNRESTRICTED_ZERO,
TRACE_GENERALIZATION_FLOOR_5,
# ENTROPIC_RELEVANCE_ZERO_ORDER,
STRUCTURAL_SIMPLICITY_EDGE_COUNT,
EARTH_MOVERS,
ENTROPY_PRECISION,
ENTROPY_RECALL
)


# PCA on the DR-view selections defined above (expc3dr, eval3dr, expc2dr, eval2dr)
pcexp3dr <- prcomp(expc3dr,scale=TRUE)

pceval3dr <- prcomp(eval3dr,scale=TRUE)

pcexp2dr <- prcomp(expc2dr,scale=TRUE)

pceval2dr <- prcomp(eval2dr,scale=TRUE)


# NM view

expc3nm <- rdnl %>% select(# ALPHA_PRECISION_UNRESTRICTED_ZERO,
TRACE_GENERALIZATION_FLOOR_5,
# TRACE_GENERALIZATION_DIFF_UNIQ,
ENTROPIC_RELEVANCE_ZERO_ORDER,
# TRACE_RATIO_GOWER_2,
STRUCTURAL_SIMPLICITY_EDGE_COUNT
# MODEL_STRUCTURAL_STOCHASTIC_COMPLEXITY
)

eval3nm <- rdev %>% select( ALPHA_PRECISION_UNRESTRICTED_ZERO,
# TRACE_GENERALIZATION_FLOOR_5,
# TRACE_GENERALIZATION_DIFF_UNIQ,
ENTROPIC_RELEVANCE_ZERO_ORDER,
# TRACE_RATIO_GOWER_2,
STRUCTURAL_SIMPLICITY_EDGE_COUNT,
# MODEL_STRUCTURAL_STOCHASTIC_COMPLEXITY,
EARTH_MOVERS
# ,ENTROPY_PRECISION, # excluded for vagueness + expense - better proxies for PCA dim
# ENTROPY_RECALL # excluded for vagueness + expense - better proxies for PCA dim
)

pcexp3nm <- prcomp(expc3nm,scale=TRUE)

pceval3nm <- prcomp(eval3nm,scale=TRUE)





resetplots()


# warning: interactive from here down

# ctpc <- pceval2dr
# ctpc <- pcexp3dr
ctpc <- pceval3dr
# ctpc <- pcexp3nm
# ctpc <- pceval3nm


pcv <- get_pca_var(ctpc)



nfactors <- 3

pcaheatmap(pcv$contrib)
pcaheatmap(pcv$cos2)
pcaheatmap(pcv$cor)
pcaheatmap(ctpc$rotation)

# pairs(ica3$S )

# scree
print(fviz_eig(ctpc)
+ geom_hline(yintercept = 10))


fviz_pca_biplot(ctpc, repel = FALSE,
col.var = "#2E9FDF", # Variables color
col.ind = "#696969", # Individuals color
label = "var")

fviz_pca_biplot(ctpc, axes=c(2,3), repel = FALSE,
col.var = "#2E9FDF", # Variables color
col.ind = "#696969", # Individuals color
label = "var")

fviz_pca_biplot(ctpc, axes=c(1,3), repel = FALSE,
col.var = "#2E9FDF", # Variables color
col.ind = "#696969", # Individuals color
label = "var")

fviz_pca_biplot(ctpc, axes=c(1,4),repel = FALSE,
col.var = "#2E9FDF", # Variables color
col.ind = "#696969", # Individuals color
label = "var")



triplot(ctpc,theta=25, phi=20)
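
The `pcapred` and `coord_func` helpers in `scripts/crosscmp.R` project new observations onto an existing PCA by rescaling with the stored centre and scale and then summing loading-weighted values row by row. The sketch below, which is not part of the repository, suggests that this should be equivalent to a single matrix product and to `stats::predict()` on a `prcomp` object. It uses the built-in `USArrests` data as a stand-in for the experiment CSV files, and the names `pc`, `newdata`, `manual` and `builtin` are illustrative assumptions.

```r
# Fit a PCA on part of a built-in dataset (a stand-in for the experiment data).
pc <- prcomp(USArrests[1:40, ], scale. = TRUE)

# Held-out rows to project onto the fitted components.
newdata <- USArrests[41:50, ]

# Manual projection: rescale with the PCA's centre and scale, then
# multiply by the loadings (rotation) matrix.
scaled <- scale(newdata, center = pc$center, scale = pc$scale)
manual <- scaled %*% pc$rotation

# Built-in projection for comparison.
builtin <- predict(pc, newdata)

# TRUE when both projections agree up to numerical tolerance.
agree <- isTRUE(all.equal(manual, builtin, check.attributes = FALSE))
```

If the two agree on the real data as well, `predict(ctpc, rd)` could stand in for `pcapred(rd, ctpc)`; that substitution has not been checked against the repository's data.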

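
The comment block in `scripts/crosscmp.R` that marks measures as "KEEP" or "exclude" groups them by mutual correlation. As a rough illustration of how such groups might be spotted, the base-R sketch below lists variable pairs whose absolute correlation exceeds a cutoff. The helper `high_cor_pairs`, the 0.9 cutoff and the synthetic data are all assumptions made for this example; the source does not state how the groups were actually derived.

```r
# Hypothetical helper: return variable pairs with |correlation| above a cutoff.
high_cor_pairs <- function(df, cutoff = 0.9) {
  cm <- cor(df)
  # Look only at the upper triangle so each pair is reported once.
  idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
  data.frame(var1 = rownames(cm)[idx[, "row"]],
             var2 = colnames(cm)[idx[, "col"]],
             cor  = cm[idx],
             stringsAsFactors = FALSE)
}

# Synthetic demonstration data: b is nearly a copy of a, c is independent.
set.seed(1)
x <- rnorm(100)
demo <- data.frame(a = x, b = x + rnorm(100, sd = 0.01), c = rnorm(100))

pairs_found <- high_cor_pairs(demo)
```

On data such as `rdnl`, one measure from each reported pair could then be retained as a representative, which is the pattern the "KEEP" annotations appear to follow.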