Question about handling NAs #6
Comments
Missing values are tricky, and I'm afraid I don't have a one-size-fits-all answer for you. However, here are some suggestions:
It would be nice not to see clusters in either of the two. You could also look at the sample correlations before and after correction. Ideally, intra-batch sample correlations should not be higher than inter-batch correlations after correction. Also, what happens if you look at peptide-level diagnostics rather than proteome-wide ones? Do you still see a batch-dependent trend in individual peptide intensities?
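To make the correlation check concrete, here is a minimal sketch in plain Python. It is not proBatch code: the function names and the dict-based data layout are assumptions made for illustration, and missing values are simply skipped pairwise. It compares the mean correlation of sample pairs within a batch to that of pairs across batches.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation over positions where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 3:
        return math.nan  # too few shared peptides to say anything
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


def mean_intra_inter_correlation(samples, batches):
    """samples: {sample name: [peptide intensities, NaN = missing]}
    batches: {sample name: batch label}
    Returns (mean intra-batch r, mean inter-batch r)."""
    intra, inter = [], []
    for s1, s2 in combinations(sorted(samples), 2):
        r = pearson(samples[s1], samples[s2])
        if math.isnan(r):
            continue
        (intra if batches[s1] == batches[s2] else inter).append(r)

    def mean(values):
        return sum(values) / len(values) if values else math.nan

    return mean(intra), mean(inter)
```

Running this on the matrix before and after correction shows whether the gap between the two means shrinks; a clearly higher intra-batch mean after correction suggests residual batch structure.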
We are trying to integrate proBatch into our data analysis pipelines, but our data has a lot of NAs. We've found that batch correction only works when we filter out every peptide containing NAs, which leaves us with a small fraction of our data. If we keep peptides with at most 20% NAs or at most 50% NAs, batch effects are still visible in our data after batch correction. What is the best way to handle NAs in our data? Should they be removed before doing batch correction?
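For reference, the kind of filter described above can be sketched as follows. This is plain Python rather than anything from proBatch; the function name, the `max_missing` parameter, and the dict-of-lists layout are hypothetical.

```python
import math


def filter_by_missing_fraction(matrix, max_missing=0.2):
    """Keep peptides whose fraction of missing values is at most max_missing.

    matrix: {peptide: [intensity per sample, NaN = missing]}
    """
    kept = {}
    for peptide, values in matrix.items():
        n_missing = sum(1 for v in values if math.isnan(v))
        if n_missing / len(values) <= max_missing:
            kept[peptide] = values
    return kept
```

Setting `max_missing=0.0` corresponds to keeping only complete peptides, while `0.2` and `0.5` correspond to the two looser thresholds mentioned.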
We also found that if we impute NAs with row means before running the PCA and PVCA (but after batch correction), the batch effects seem to disappear. With the default -1 imputation for NAs in PCA and PVCA, clear batch effects are still visible in the data after batch correction. Is -1 imputation the best method for handling NAs when trying to view batch effects in PCA and PVCA?
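To spell out the two imputation strategies being compared, here is a small sketch for a single peptide row. It is plain Python, not the proBatch implementation, and the function names are made up for illustration.

```python
import math


def impute_constant(values, fill=-1.0):
    """Replace every missing value in a peptide row with a fixed constant."""
    return [fill if math.isnan(v) else v for v in values]


def impute_row_mean(values):
    """Replace missing values with the mean of the observed values in the row."""
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        return list(values)  # nothing observed, leave the row untouched
    row_mean = sum(observed) / len(observed)
    return [row_mean if math.isnan(v) else v for v in values]
```

One thing to keep in mind, assuming the intensities are log-scale and well above -1: a constant fill puts every missing value far from the observed range, so the pattern of missingness itself becomes a strong signal in PCA, whereas a row-mean fill places missing values at the peptide's center, where they add no variance. The two views can therefore disagree if the NAs are unevenly distributed across batches.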