latent space bias functions moved into exploration file and notebook … #517
…to use these functions
Anamika, this is awesome! Thanks so much.
This is almost ready. Two things to address:
- There should only be one Python notebook; please delete one of them.
- We have a strict function comment style. Not all of the code adheres to it yet, but it is good practice to reformat these comments as docstrings and add type hints.
ml4h/explorations.py
Outdated
CSV_EXT = '.tsv'
#AK latent bias functions added________________________________________________
delete
ml4h/explorations.py
Outdated
#AK latent bias functions added________________________________________________
### Function to divide data into groups with a balanced ratio, and transform the data into a new latent space..
We have a very specific structure for function comments: they should be multi-line docstrings, delimited by three double quotes, placed directly under the function definition. Here is one example:
def plot_prediction_calibration(
    prediction: np.ndarray,
    truth: np.ndarray,
    labels: Dict[str, int],
    title: str,
    prefix: str = "./figures/",
    n_bins: int = 10,
    dpi: int = 300,
    width: int = 6,
    height: int = 6,
):
    """Plot calibration performance and compute Brier Score.

    :param prediction: Array of probabilistic predictions with shape (num_samples, num_classes)
    :param truth: The true classifications of each class, one hot encoded of shape (num_samples, num_classes)
    :param labels: Dictionary mapping strings describing each class to their corresponding index in the arrays
    :param title: The name of this plot
    :param prefix: Optional path prefix where the plot will be saved
    :param n_bins: Number of bins to quantize predictions into
    :param dpi: Dots per inch of the figure
    :param width: Width in inches of the figure
    :param height: Height in inches of the figure
    """
ml4h/explorations.py
Outdated
### Function to divide data into groups with a balanced ratio, and transform the data into a new latent space..
def stratify_and_project_latent_space(stratify_column, stratify_thresh, stratify_std,
add type hints
ml4h/explorations.py
Outdated
    return {f'{stratify_column}': (t2, p2, len(hit)) }
#Function to create a plot displaying T-statistics v/s Negative Log P-Value for each covariate.
type hints and comment re-format
ml4h/explorations.py
Outdated
    plt.tight_layout()
#Function to calculate angle between two vectors
type hints and comment re-format
ml4h/explorations.py
Outdated
    return np.arccos(np.clip(np.dot(v1_u, v2_u), -1.0, 1.0)) * 180 / 3.141592
def unit_vector(vector):
type hints
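One possible shape for the two vector helpers once type hints and docstrings are added, offered as a sketch rather than the final code. The name `angle_between` and the parameter names `v1` and `v2` are assumptions, since only the return line of that function appears in the diff, and `np.degrees` is swapped in for the hand-written `* 180 / 3.141592`.

```python
import numpy as np


def unit_vector(vector: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length.

    :param vector: Non-zero array to normalize
    :return: Array with the same direction as vector and Euclidean norm 1
    """
    return vector / np.linalg.norm(vector)


def angle_between(v1: np.ndarray, v2: np.ndarray) -> float:
    """Compute the angle between two vectors in degrees.

    Note: the function and parameter names here are assumed, not taken from the PR.

    :param v1: First vector
    :param v2: Second vector, same shape as v1
    :return: Angle between v1 and v2 in degrees, in the range [0, 180]
    """
    v1_u = unit_vector(v1)
    v2_u = unit_vector(v2)
    # Clip guards against dot products slightly outside [-1, 1] due to rounding.
    cosine = np.clip(np.dot(v1_u, v2_u), -1.0, 1.0)
    return float(np.degrees(np.arccos(cosine)))
```

Using `np.degrees` avoids the truncated constant for pi, so perpendicular vectors come out at 90 degrees to within floating point precision.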
ml4h/explorations.py
Outdated
    return vector / np.linalg.norm(vector)
#Function to read raw data from a CSV file and generate a representation of the data in a latent space.
def latent_space_dataframe(infer_hidden_tsv, explore_csv):
type hints and comment re-format
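As a rough illustration of the requested re-format for this function, the leading `#` comment could move into a docstring like the one below. The argument types (`str` paths) and the `pd.DataFrame` return type are assumptions inferred from the parameter names and the `return latent_df` line; the function body is not shown in this thread, so the sketch leaves it out.

```python
import pandas as pd


def latent_space_dataframe(infer_hidden_tsv: str, explore_csv: str) -> pd.DataFrame:
    """Read raw data from a CSV file and build a latent space representation of it.

    :param infer_hidden_tsv: Assumed to be the path to a TSV file of inferred latent vectors
    :param explore_csv: Assumed to be the path to a CSV file of raw data
    :return: DataFrame holding the latent space representation
    """
    # The original body is not visible in the review, so this sketch stops here.
    raise NotImplementedError
```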
ml4h/explorations.py
Outdated
    return latent_df
#confounder is a variable that influences both the dependent variable and independent variable
type hints and comment re-format
ml4h/explorations.py
Outdated
    return clf[-1].coef_/clf[0].scale_, train_score
def confounder_matrix(adjust_cols, df, space):
type hints and comment re-format
ml4h/explorations.py
Outdated
    return np.array(vectors), scores
# Function to remove confounder variables
def iterative_subspace_removal(adjust_cols, latent_df, latent_cols, r2_thresh=0.01, fit_pca=False):
type hints and comment re-format
Great work @anamika1302!
#517) * latent space bias functions moved into exploration file and notebook to use these functions
Explorations.py: Moved all latent_space_bias_detection functions to this file.
Latent_space_bias_detection_with_import notebook: This notebook lets you import all the functions and run the analysis for both ECG & MRI.
latent_space_bias_detection_terra.ipynb: This is an exact replica of the code on Terra; you can drop it if it is not required. I only used it before moving the functions into the explorations file.