explainability: new module #44

JochenSiegWork · 2024-07-05T09:40:03Z

- Add proof of concept for Explainer class and explanation
  data structures to express explanations for feature
  vectors and molecules.
- Add Christian W. Feldmanns visualization code for shap
  weighted heatmaps of the molecular structure.

c-w-feldmann

Move to experimental.
Only ignore errors where required.

Overall this is a lot to look through and my brain started to nope out...
I think we can leave it as is and improve later. I still think the text below the explanations is confusing.

molpipeline/explainability/explainer.py

molpipeline/explainability/visualization/utils.py

tests/test_explainability/test_shap_explainers.py

c-w-feldmann

Sorry still haven't finished everything. Just submitting a WIP

c-w-feldmann · 2024-12-09T09:47:18Z

molpipeline/experimental/explainability/explainer.py

+    return feature_matrix
+
+
+def _convert_to_array(value: Any) -> npt.NDArray[np.float64]:


I think we could use numpy.atleast_1d instead.
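A minimal sketch of what this suggestion could look like, assuming the helper only needs to turn scalars and arrays into a float array with at least one dimension. The name `convert_to_array` is a hypothetical stand-in and not the function from the PR; the original error handling is not reproduced here.

```python
import numpy as np
import numpy.typing as npt


def convert_to_array(value: npt.ArrayLike) -> npt.NDArray[np.float64]:
    """Return value as a float64 array with at least one dimension."""
    # np.atleast_1d wraps scalars into shape (1,) and leaves inputs that
    # already have one or more dimensions unchanged.
    return np.atleast_1d(np.asarray(value, dtype=np.float64))
```

With this form, a scalar such as `3` becomes an array of shape `(1,)`, while existing arrays keep their shape.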

c-w-feldmann · 2024-12-09T09:56:07Z

molpipeline/experimental/explainability/explainer.py

+    return atom_weights
+
+
+ShapExplanation: TypeAlias = list[


If I read this correctly, TypeAlias is deprecated: https://docs.python.org/3/library/typing.html#typing.TypeAlias

Maybe:

ShapExplanation = Sequence[SHAPFeatureExplanation | SHAPFeatureAndAtomExplanation]

If it's obvious, some variables aren't required to be explicitly typed.

Also the name reads to me like this was an implemented class. Maybe ExplanationList or something like this?

You're right. I removed the type and now it's explicitly written as list[SHAPFeatureExplanation | SHAPFeatureAndAtomExplanation]. However, I stuck with list rather than Sequence, since we only use lists.

c-w-feldmann

Notebook: After cell [16]:
We hypothesis -> We hypothesi(z/s)e?

c-w-feldmann · 2025-01-07T10:47:53Z

molpipeline/experimental/explainability/explainer.py

+    raise ValueError("Value is not a scalar or numpy array.")
+
+
+def _get_prediction_function(pipeline: Pipeline | BaseEstimator) -> Any:


-> Callable[[npt.ArrayLike], npt.ArrayLike]
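A rough sketch of the more precise return type, assuming the helper picks a prediction method from the model. The selection logic, the name `get_prediction_function`, and the toy `ConstantModel` are all assumptions for illustration; the PR's actual implementation is not shown in this thread.

```python
from typing import Any, Callable

import numpy as np
import numpy.typing as npt

# Alias for readability; the name is hypothetical.
PredictionFunction = Callable[[npt.ArrayLike], npt.ArrayLike]


def get_prediction_function(model: Any) -> PredictionFunction:
    """Return predict_proba if the model has it, otherwise predict."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba
    if hasattr(model, "predict"):
        return model.predict
    raise ValueError("Model has neither predict_proba nor predict.")


class ConstantModel:
    """Toy model used only to exercise the sketch."""

    def predict(self, X: npt.ArrayLike) -> npt.ArrayLike:
        # One zero per input row.
        return np.zeros(len(np.asarray(X)))
```

Typing the return value as a `Callable` instead of `Any` lets mypy check how callers invoke the returned function.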

c-w-feldmann · 2025-01-09T12:47:38Z

molpipeline/experimental/explainability/explainer.py

+    return atom_weights
+
+
+ShapExplanation: TypeAlias = list[


Maybe:

ShapExplanation = Sequence[SHAPFeatureExplanation | SHAPFeatureAndAtomExplanation]

c-w-feldmann · 2025-01-09T12:48:39Z

molpipeline/experimental/explainability/explainer.py

+    return atom_weights
+
+
+ShapExplanation: TypeAlias = list[


If it's obvious, some variables aren't required to be explicitly typed.

c-w-feldmann · 2025-01-09T12:51:49Z

molpipeline/experimental/explainability/explainer.py

+
+        # determine type of returned explanation
+        featurization_element = self.featurization_subpipeline.steps[-1][1]  # type: ignore[union-attr]
+        self.return_element_type_: type[


Type hints for class/instance variables should be declared outside of functions. It would be best to do this above the `__init__` function.
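A small sketch of the requested layout. `ExampleExplainer` and its attributes are hypothetical; only the placement of the annotations is the point.

```python
from typing import Optional


class ExampleExplainer:
    """Hypothetical class illustrating class-level annotations."""

    # Instance variables are annotated once, at class level, above __init__.
    return_element_type_: type
    name: Optional[str]

    def __init__(self, return_element_type: type, name: Optional[str] = None) -> None:
        # Assignments inside methods then need no inline annotation.
        self.return_element_type_ = return_element_type
        self.name = name
```

A bare annotation like this does not create a class attribute; it only records the type, so the values still live on each instance.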

c-w-feldmann · 2025-01-09T12:56:21Z

molpipeline/experimental/explainability/explainer.py

+    return atom_weights
+
+
+ShapExplanation: TypeAlias = list[


Also the name reads to me like this was an implemented class. Maybe ExplanationList or something like this?

c-w-feldmann · 2025-01-09T13:04:43Z

molpipeline/experimental/explainability/explainer.py

+import shap
+from scipy.sparse import issparse, spmatrix
+from sklearn.base import BaseEstimator
+from typing_extensions import override


Better to use typing.override.
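For reference, `typing.override` is only available from Python 3.12 onward. A minimal sketch that prefers the standard library and falls back to the backport, assuming older interpreters still need to be supported (which this thread does not state); `Base` and `Child` are hypothetical classes:

```python
try:
    from typing import override  # Python 3.12+
except ImportError:
    from typing_extensions import override  # backport for older versions


class Base:
    def explain(self) -> str:
        return "base"


class Child(Base):
    @override  # a type checker flags this if Base has no explain() method
    def explain(self) -> str:
        return "child"
```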

molpipeline/experimental/explainability/visualization/heatmaps.py

c-w-feldmann · 2025-01-09T13:28:34Z

tests/test_experimental/test_explainability/test_shap_explainers.py

+            if isinstance(explainer, SHAPTreeExplainer) and isinstance(
+                estimator, GradientBoostingClassifier
+            ):
+                # there is currently a bug in SHAP's TreeExplainer for GradientBoostingClassifier


Maybe log a warning or info so we might remember this :D

I think we don't need a warning since the current test code will break if they fix the bug. So we'll know when the dependency updates. This should be safely captured by the CI/CD pipeline.
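The idea in this reply, a test that fails once the upstream bug is fixed, can be sketched generically as below. This is an assumption about the pattern, not the PR's test code: `buggy_dependency_call` is a hypothetical stand-in for the affected SHAP call, and the exception type is made up.

```python
import unittest


def buggy_dependency_call() -> None:
    """Hypothetical stand-in for the upstream call affected by the bug."""
    raise RuntimeError("known upstream bug")


class TestKnownUpstreamBug(unittest.TestCase):
    def test_bug_still_present(self) -> None:
        # This assertion fails as soon as the dependency stops raising,
        # which signals in CI that the workaround can be removed.
        with self.assertRaises(RuntimeError):
            buggy_dependency_call()
```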

JochenSiegWork · 2025-01-15T10:36:40Z

Notebook: After cell [16]: We hypothesis -> We hypothesi(z/s)e?

All comments on the notebook are addressed

- pylint fails to parse f-strings that reuse the enclosing quotation mark inside the curly brackets, e.g. f"{"foo"}", whereas f"{'foo'}" works.
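A short illustration of the quoting issue, using a made-up dictionary. Reusing the enclosing quote type inside the braces is only valid syntax from Python 3.12 (PEP 701), so the sketch keeps that form in a comment and runs only the portable one.

```python
labels = {"key": "foo"}

# Portable form: the inner quotes differ from the enclosing quotes.
portable = f"{labels['key']}"

# Python 3.12+ also accepts the same quote type inside the braces:
#     f"{labels["key"]}"
# This is the form the commit reports pylint as failing to parse.
```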

c-w-feldmann · 2025-01-15T15:41:03Z

molpipeline/experimental/explainability/visualization/heatmaps.py

-            Callable[[npt.NDArray[np.float64]], npt.NDArray[np.float64]]
-        ] = []
+        if function_list is not None:
+            self.function_list: list[


Type hints for class/instance vars should be defined outside of functions.
...
I read a bit into this, and it seems that it's not actually required by a PEP, but only done this way in PEP 526.

The PEP itself is now also marked as a historical document.
But the rest of the molpipeline code is done in the same way, and I think that it's still a good idea.
Current examples still do it this way: https://typing.readthedocs.io/en/latest/spec/class-compat.html

c-w-feldmann

almost there

JochenSiegWork · 2025-01-16T09:18:26Z

Now we are waiting for numba Python 3.13 support, which is needed for SHAP. The release status can be seen here: numba/numba#9896

JochenSiegWork marked this pull request as draft July 5, 2024 09:40

JochenSiegWork self-assigned this Jul 5, 2024

JochenSiegWork force-pushed the explainability_module branch from bf9714d to 3869edb Compare August 13, 2024 08:02

JochenSiegWork force-pushed the explainability_module branch from cdbece2 to 21a13bf Compare October 9, 2024 07:01

JochenSiegWork force-pushed the explainability_module branch 2 times, most recently from 9d236b9 to d6d880e Compare November 21, 2024 12:24

JochenSiegWork added 24 commits November 25, 2024 13:49

explainability: new module

8e24c7a

explainability: changes for numpy2

8b39bf7

explanability: add new visualization

6573676

explainability: linting vis code

3bd54d1

explainability: fix linting

36f2517

explainability: vis linting

3fdeee5

explainability: more linting

bd78f18

linting

2f6c521

linting again

83df2dc

explaonability: add matplotlib as dependency for visualization

787e79f

explainability: improve speed

368b9af

explainability: speed up unittests

59b550f

explainability: suppress already checked mypy warning

3cbf568

mypy

d69d2b1

mypy + rename interface atom_weights

c8ac95f

explainability: add more visualization

206af69

explainability: refactored shap heatmap visualization

76395c4

linting

9db8ca0

linting

e1176e4

linting

2e89391

linting

e9e0102

linting

45008c9

linting

4f5c186

explainability: handle fill values better

d883236

JochenSiegWork requested a review from c-w-feldmann December 3, 2024 13:38

c-w-feldmann requested changes Dec 4, 2024

JochenSiegWork added 4 commits December 4, 2024 16:57

Christian comments 1

52387f5

linting

8d3ae06

pydocstyle

55036aa

rename xai notebook and update import

a970d3d

c-w-feldmann reviewed Dec 9, 2024

c-w-feldmann requested changes Jan 9, 2025

JochenSiegWork and others added 11 commits January 14, 2025 10:26

Merge branch 'main' into explainability_module

64220eb

Merge branch 'main' into explainability_module

0733089

fix mypy error

e7f6653

fix mypy

054a3a1

ignore mypy error

0560abf

mypy

be76892

first part code review

3bc87e3

linting

dcf95c2

linting fix

87c4888

review comments typing

66014c5

Review comments for notebook

7d89b70

JochenSiegWork added 5 commits January 15, 2025 15:19

rework visualization of present/absent features text

46949b5

mypy

38ec057

isort

2038e29

mypy

ea186cc

pylint can't parse f-strings correctly

cd59c94

c-w-feldmann reviewed Jan 15, 2025

c-w-feldmann requested changes Jan 15, 2025

heatmaps: move type defintion outside function

2213bb1

c-w-feldmann approved these changes Jan 16, 2025
