Ibaqpy parquet format output operations #84

Open
ypriverol opened this issue Dec 16, 2024 · 0 comments

Comments

@ypriverol
Member

It would be great to have a set of functions in ibaqpy_postprocessing.py that perform the following operations:

  • Remove samples with a percentage of missing values higher than X (e.g. 30%).
  • Plot a boxplot, similar to Figure 1, of the CVs and value distributions for a given quant variable (e.g. IbaqNorm). It would also be good to report the numbers behind the plot (standard deviation, etc.). Additionally, the user should be able to split the distributions by an extra variable (e.g. disease).
  • PCA and t-SNE to cluster all the samples.
  • Imputation with KNNImputer. Keep in mind that values should be imputed only among similar samples: the user should group samples by Condition, or by other variables such as disease or confounding variables.
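
A minimal sketch of how the filtering, grouped imputation and PCA steps could fit together, assuming a wide matrix with proteins as rows and samples as columns. The function names (`filter_samples_by_missing`, `impute_by_group`, `pca_coordinates`), the matrix layout and the toy data are hypothetical and not part of ibaqpy; only `KNNImputer` and `PCA` from scikit-learn are real APIs.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.impute import KNNImputer


def filter_samples_by_missing(matrix: pd.DataFrame, max_missing: float = 0.3) -> pd.DataFrame:
    """Drop samples (columns) whose fraction of missing values exceeds max_missing."""
    missing_fraction = matrix.isna().mean(axis=0)
    return matrix.loc[:, missing_fraction <= max_missing]


def impute_by_group(matrix: pd.DataFrame, groups: pd.Series, n_neighbors: int = 5) -> pd.DataFrame:
    """KNN-impute each group of samples separately (e.g. by Condition or disease).

    `groups` maps sample id -> group label. Samples without a label are left as-is.
    """
    groups = groups.reindex(matrix.columns)
    imputed = matrix.copy()
    for label in groups.dropna().unique():
        sample_ids = groups.index[groups == label]
        block = matrix[sample_ids].T  # scikit-learn expects samples as rows
        # Proteins never observed in this group cannot be imputed from it.
        observed = block.columns[block.notna().any(axis=0)]
        if len(observed) == 0:
            continue
        filled = KNNImputer(n_neighbors=n_neighbors).fit_transform(block[observed])
        imputed.loc[observed, sample_ids] = filled.T
    return imputed


def pca_coordinates(matrix: pd.DataFrame, n_components: int = 2) -> pd.DataFrame:
    """Project samples onto the first principal components (requires no missing values)."""
    coords = PCA(n_components=n_components).fit_transform(matrix.T)
    columns = [f"PC{i + 1}" for i in range(n_components)]
    return pd.DataFrame(coords, index=matrix.columns, columns=columns)


# Toy data (hypothetical): 4 proteins x 5 samples, two groups.
matrix = pd.DataFrame(
    {
        "S1": [1.0, 2.0, 3.0, 4.0],
        "S2": [1.2, np.nan, 3.1, 4.2],
        "S3": [5.0, 6.0, 7.0, 8.0],
        "S4": [5.5, 6.1, np.nan, 8.3],
        "S5": [np.nan, np.nan, np.nan, 1.0],
    },
    index=["P1", "P2", "P3", "P4"],
)
groups = pd.Series(
    {"S1": "control", "S2": "control", "S3": "disease", "S4": "disease", "S5": "disease"}
)

filtered = filter_samples_by_missing(matrix, max_missing=0.3)  # S5 is dropped
imputed = impute_by_group(filtered, groups)
coords = pca_coordinates(imputed)
```

Imputing per group keeps values from one condition from leaking into another, which is the concern raised in the last bullet. `sklearn.manifold.TSNE` could be swapped in for `PCA` in the same way, bearing in mind that its `perplexity` must be smaller than the number of samples.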

Figure 1: Boxplot of IbaqNorm 👇

[Image: variability-PXD030304-IbaqNorm-noimputation]
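
The numbers behind a boxplot like Figure 1 could be computed roughly as follows. This is only a sketch: it assumes a long-format table with one row per protein and sample, and the column names `ProteinName` and `Condition` as well as the function name `cv_summary` are assumptions made for illustration; only `IbaqNorm` is taken from the request above.

```python
import pandas as pd


def cv_summary(
    long_df: pd.DataFrame,
    value_col: str = "IbaqNorm",
    group_col: str = "Condition",
    protein_col: str = "ProteinName",
) -> pd.DataFrame:
    """Summarise per-protein coefficients of variation (CV = std / mean) within each group."""
    stats = long_df.groupby([group_col, protein_col])[value_col].agg(["mean", "std"])
    stats["cv"] = stats["std"] / stats["mean"]
    # describe() yields count, mean, std, min, quartiles and max for each group.
    return stats["cv"].groupby(level=group_col).describe()


# Toy data (hypothetical): two proteins measured in two samples per condition.
long_df = pd.DataFrame(
    {
        "ProteinName": ["P1", "P1", "P2", "P2", "P1", "P1", "P2", "P2"],
        "Condition": ["control"] * 4 + ["disease"] * 4,
        "IbaqNorm": [1.0, 3.0, 2.0, 2.0, 4.0, 4.0, 1.0, 3.0],
    }
)
summary = cv_summary(long_df)
```

Passing a different `group_col` (e.g. a disease column) gives the same statistics split by that variable, and `DataFrame.boxplot(column=..., by=...)` can draw the matching plot from the same table.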

@ypriverol ypriverol added the enhancement New feature or request label Dec 16, 2024