PLS
For the PLS modeling, I first tried to replicate the results from Rudolph et al. 2017 and was successful. The replicated results are shown below for 594 subjects, using 10-fold cross-validation repeated 10 times. The PLS model used 4 components, while the Ridge model used an alpha of 6900. This is a good first step because it confirms that our data is consistent with the data used in the paper.
Model | Train r^2 | Test r^2 |
---|---|---|
Baseline | 0.806 | 0.427 |
PLS | 0.714 | 0.438 |
Ridge | 0.943 | 0.510 |
Next, I used the same method to model IQ, but the results were worse than the age results: the test r-squared score is much lower (0.487 for age compared to 0.089 for IQ). Results are shown below.
Here are the results for using a combination of WISC measures as the target for the PLS.
WISC Measure | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
WISC_FSIQ | 0.35 | 0.38 | 0.40 | 0.38 | 0.38 |
WISC_VSI | 0.33 | 0.36 | 0.38 | 0.36 | 0.37 |
WISC_FRI | 0.27 | 0.30 | 0.33 | 0.31 | 0.31 |
WISC_WMI | 0.25 | 0.27 | 0.27 | 0.27 | 0.26 |
WISC_PSI | 0.10 | 0.13 | 0.18 | 0.15 | 0.15 |
WISC_VCI | 0.35 | 0.37 | 0.38 | 0.36 | 0.36 |
WISC_BD_Scaled | 0.32 | 0.34 | 0.37 | 0.34 | 0.35 |
WISC_Similarities_Scaled | 0.31 | 0.33 | 0.34 | 0.32 | 0.32 |
WISC_MR_Scaled | 0.22 | 0.25 | 0.28 | 0.26 | 0.27 |
WISC_DS_Scaled | 0.25 | 0.27 | 0.27 | 0.26 | 0.26 |
WISC_Coding_Scaled | 0.06 | 0.08 | 0.12 | 0.10 | 0.09 |
WISC_Vocab_Scaled | 0.34 | 0.37 | 0.38 | 0.36 | 0.36 |
WISC_FW_Scaled | 0.25 | 0.28 | 0.31 | 0.29 | 0.28 |
WISC_VP_Scaled | 0.29 | 0.32 | 0.34 | 0.32 | 0.32 |
WISC_PS_Scaled | 0.18 | 0.20 | 0.21 | 0.20 | 0.19 |
WISC_SS_Scaled | 0.12 | 0.15 | 0.20 | 0.17 | 0.17 |
# With IQ
Target: WISC_FSIQ | r: 0.35
Target: WISC_VSI | r: 0.33
Target: WISC_FRI | r: 0.27
Target: WISC_WMI | r: 0.25
Target: WISC_PSI | r: 0.10
Target: WISC_VCI | r: 0.35
# Without IQ
Target: WISC_VSI | r: 0.33
Target: WISC_FRI | r: 0.27
Target: WISC_WMI | r: 0.25
Target: WISC_PSI | r: 0.10
Target: WISC_VCI | r: 0.35
| | All | Bin 1 | Bin 2 | Bin 3 |
---|---|---|---|---|
All | 1 | 0.0833 | 0.0821 | 0.0157 |
Bin 1 | - | 1 | 0.0172 | 0.0117 |
Bin 2 | - | - | 1 | 0.0100 |
Bin 3 | - | - | - | 1 |
Positive and Negative Clipped Values
[[1. 0.06348429 0.09552818 0.01238531] [0.06348429 1. 0.02611998 0.01143025] [0.09552818 0.02611998 1. 0.00417967] [0.01238531 0.01143025 0.00417967 1. ]]
[[1. 0.10837503 0.07464222 0.02468051] [0.10837503 1. 0.03270236 0.01762623] [0.07464222 0.03270236 1. 0.03451587] [0.02468051 0.01762623 0.03451587 1. ]]
| | All | Bin 1 | Bin 2 | Bin 3 |
---|---|---|---|---|
All | 1 | 0.0484 | 0.0677 | 0.0255 |
Bin 1 | - | 1 | 0.0119 | 0.0056 |
Bin 2 | - | - | 1 | 0.0031 |
Bin 3 | - | - | - | 1 |
The main goal of this approach is to reduce the overfitting (train score higher than test score) of the MI + PLS model by reducing the noise in the data, which should make the model generalize better to unseen data (the testing set). I tried two approaches: adding noisy samples and grouping samples.
The first approach adds noise to the existing samples and then appends those noisy samples to the dataset. Specifically, I added Gaussian noise with zero mean (so that the mean of the data does not change) and varied the standard deviation. The results are shown below.
| | Baseline | 2x noise, std / 10 | 2x noise, std | Pure noise | 5x noise |
---|---|---|---|---|---|
Num Train Samples | 474 | 1896 | 1896 | 474 | 15168 |
Train Score | 0.32 | 0.32 | 0.37 | 0.87 | 0.35 |
Test Score | 0.13 | 0.13 | 0.13 | -0.45 | 0.13 |
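A minimal sketch of the noise augmentation described above, assuming the noise standard deviation is set per feature as a multiple of that feature's own standard deviation. The function name `augment_with_noise` and the parameters `n_copies` and `std_scale` are hypothetical, and the exact scheme used to produce the table is not stated in these notes.

```python
import numpy as np

def augment_with_noise(X, y, n_copies=1, std_scale=1.0, seed=0):
    """Append n_copies noisy copies of X to the data.

    Noise is zero-mean Gaussian; its standard deviation for each feature
    is std_scale times that feature's standard deviation (an assumption).
    Targets are repeated unchanged for the noisy copies.
    """
    rng = np.random.default_rng(seed)
    feature_std = X.std(axis=0)
    X_parts, y_parts = [X], [y]
    for _ in range(n_copies):
        noise = rng.standard_normal(X.shape) * (std_scale * feature_std)
        X_parts.append(X + noise)
        y_parts.append(y)
    return np.concatenate(X_parts), np.concatenate(y_parts)

# Placeholder training data: 474 samples, 50 features.
rng = np.random.default_rng(1)
X_train = rng.standard_normal((474, 50))
y_train = rng.standard_normal(474)

# Three noisy copies plus the originals: 4 * 474 = 1896 training samples.
X_aug, y_aug = augment_with_noise(X_train, y_train, n_copies=3, std_scale=0.1)
print(X_aug.shape, y_aug.shape)  # (1896, 50) (1896,)
```

Only the training set should be augmented this way; the test set stays untouched so that the test score still reflects performance on real, unseen subjects.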
The second approach groups some number of subjects and averages each group to create a pseudo-subject. Averaging reduces the noise, which partly cancels out across subjects, while reinforcing the signal that the subjects in a group share. The results are shown below.
| | Baseline | Group 6 | Group 6 IQ |
---|---|---|---|
Num Train Samples | 474 | 113 | 113 |
Train Score | 0.32 | 0.64 | 0.72 |
Test Score | 0.13 | 0.04 | 0.45 |
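The grouping idea can be sketched as below. This assumes that subjects are assigned to groups at random, that leftover subjects are dropped, and that both the features and the target are averaged within a group. These notes do not say how the groups were actually formed, so `make_pseudo_subjects` and its parameters are hypothetical.

```python
import numpy as np

def make_pseudo_subjects(X, y, group_size, seed=0):
    """Average random groups of subjects into pseudo-subjects.

    Subjects are shuffled, split into groups of group_size, and each
    group's features and targets are averaged. Subjects that do not
    fill a complete group are dropped (an assumption).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_groups = len(X) // group_size
    groups = order[: n_groups * group_size].reshape(n_groups, group_size)
    X_grouped = X[groups].mean(axis=1)  # (n_groups, n_features)
    y_grouped = y[groups].mean(axis=1)  # (n_groups,)
    return X_grouped, y_grouped

# Placeholder data: 120 subjects, 30 features, groups of 3.
rng = np.random.default_rng(2)
X_demo = rng.standard_normal((120, 30))
y_demo = rng.standard_normal(120)
X_pseudo, y_pseudo = make_pseudo_subjects(X_demo, y_demo, group_size=3)
print(X_pseudo.shape, y_pseudo.shape)  # (40, 30) (40,)
```

Note the trade-off visible in the table: grouping shrinks the number of training samples, so any gain from averaging out noise has to outweigh the loss of sample size.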
All Ages - Groups of 3
WISC Primary Index | Num Connections | r^2 | Pearson | Spearman |
---|---|---|---|---|
Intelligence Quotient (FSIQ) | 1000 | 0.331 | 0.593 | 0.622 |
Visual Spatial (VSI) | 1000 | 0.302 | 0.569 | 0.588 |
Verbal Comprehension (VCI) | 1500 | 0.349 | 0.609 | 0.641 |
Fluid Reasoning (FRI) | 2000 | 0.225 | 0.498 | 0.499 |
Working Memory (WMI) | 2000 | 0.201 | 0.477 | 0.493 |
Processing Speed (PSI) | 3000 | -0.024 | 0.231 | 0.232 |
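The Processing Speed row has a negative r^2 alongside a positive Pearson correlation. That is possible when r^2 is the coefficient of determination (a squared correlation could not be negative): it penalizes offset and scale errors, while Pearson and Spearman only measure how well the predictions track the targets. A small sketch with made-up numbers, where `r_squared` is a helper written for this illustration:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up predictions: correct ordering, but shifted and compressed.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([3.8, 3.9, 4.0, 4.1, 4.2])

r2 = r_squared(y_true, y_pred)
pearson = pearsonr(y_true, y_pred)[0]
spearman = spearmanr(y_true, y_pred)[0]

print(f"r^2 = {r2:.2f}")            # about -0.31
print(f"Pearson = {pearson:.2f}")   # about 1.00
print(f"Spearman = {spearman:.2f}") # about 1.00
```

Here the predictions are perfectly correlated with the targets, yet they fit worse than simply predicting the mean, so r^2 drops below zero.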