Change in stats when setting GroupOrder #9

Closed
ManuelaS opened this issue Dec 12, 2019 · 1 comment

Comments

@ManuelaS

Hi,

I came across some unexpected behaviour when passing the "GroupOrder" flag. Specifically, the p-value differs depending on the group order and, in the last example below, is a complex number. The flag worked as expected when I tested it with the example datasets and with other grouping variables of this dataset. I suspect the problem is related to convergence (the groups "appendix" and "others" have few patients and no events).

```matlab
% Data from https://bitbucket.org/manuela_s/translating_network_biomarkers_into_the_clinic/src/master/data/clinical_data.csv
data = readtable('clinical_data.csv');
% Drop patients with missing censoring information
data.is_censored_overall_survival = categorical(data.is_censored_overall_survival);
data = data(~isundefined(data.is_censored_overall_survival), :);

% Simple call generates correct Kaplan-Meier plot and reports stats
[p, fh, stats] = MatSurv(...
    data.overall_survival_months,...
    data.is_censored_overall_survival=='no',...
    data.tumour_site)

% Specifying "GroupsToUse" and including all the groups in the same order
% as default (NoOp) produces the expected Kaplan-Meier plot and stats
[p, fh, stats] = MatSurv(...
    data.overall_survival_months,...
    data.is_censored_overall_survival=='no',...
    data.tumour_site,...
    'GroupsToUse', {'appendix', 'distal', 'others', 'proximal', 'rectal'})

% Specifying "GroupsToUse" and including all the groups but in a different order
% than default produces the expected Kaplan-Meier plot, but reports
% different stats (and a complex number p-value in this case)
[p, fh, stats] = MatSurv(...
    data.overall_survival_months,...
    data.is_censored_overall_survival=='no',...
    data.tumour_site,...
    'GroupsToUse', {'appendix', 'proximal', 'distal', 'rectal', 'others'})
```
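One way to check the suspicion about small groups without events is to tabulate samples and events per group before calling MatSurv. The following is a minimal sketch using only base MATLAB functions. The toy data and the variable names (`group`, `event`, `groupSummary`, `noEventGroups`) are made up for illustration and are not the clinical dataset above.

```matlab
% Hypothetical toy data (NOT the clinical dataset from this issue)
group = categorical({'appendix'; 'appendix'; 'others'; 'others'; ...
                     'distal'; 'distal'; 'distal'; 'rectal'; 'rectal'});
event = logical([0; 0; 0; 0; 1; 0; 1; 1; 1]);  % true = event observed

% Count samples and events within each group
[gid, gname] = findgroups(group);
nSamples = splitapply(@numel, event, gid);
nEvents  = splitapply(@sum, double(event), gid);
groupSummary = table(gname, nSamples, nEvents);
disp(groupSummary)

% Flag groups in which no event was observed
noEventGroups = gname(nEvents == 0);
if numel(noEventGroups) >= 2
    warning('%d groups have no events: %s', numel(noEventGroups), ...
        strjoin(cellstr(noEventGroups), ', '));
end
```

With the real table, `group` and `event` would presumably be replaced by `categorical(data.tumour_site)` and `data.is_censored_overall_survival=='no'`.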

@aebergl
Owner

aebergl commented Dec 15, 2019

Hello,
Thanks for taking the time to find this bug! It turns out that the calculation of the log-rank p-value does not work when there are two or more groups with no events. I have added error checking for the groups that warns about this condition but still displays the KM plot. By default, I now also remove groups with fewer than two samples. I have also added a way to easily merge groups by passing a multi-level cell structure as the GroupsToUse input variable.
Your example would be fixed using:
[p, fh, stats] = MatSurv(data.overall_survival_months, data.is_censored_overall_survival=='no', data.tumour_site, 'GroupsToUse', {{'appendix+others', 'appendix', 'others'}, 'proximal', 'distal', 'rectal'});
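
For readers who want to see what merging the two groups amounts to, a similar relabelling can be done on the grouping variable itself with MATLAB's `mergecats`. This is only an illustrative sketch on made-up labels; it is not how MatSurv implements the multi-level `GroupsToUse` input, and the variable names `site` and `merged` are hypothetical.

```matlab
% Hypothetical toy labels (NOT the clinical dataset from this issue)
site = categorical({'appendix'; 'others'; 'proximal'; 'distal'; 'rectal'; 'others'});

% Collapse 'appendix' and 'others' into one category named 'appendix+others'
merged = mergecats(site, {'appendix', 'others'}, 'appendix+others');

% Inspect the resulting categories and their counts
disp(categories(merged))
disp(countcats(merged))
```

The merged labels (for example converted with `cellstr(merged)`) could then presumably be used as the grouping variable, although the `GroupsToUse` call above is the route the maintainer suggests.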

@aebergl aebergl closed this as completed Dec 15, 2019