Replace distance search with capped_distance in water bridge analysis #2480

xiki-tempula · 2020-01-29T16:17:53Z

Changes made in this Pull Request:
Changes the distance calculation to capped_distance in water bridge analysis.
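
For readers unfamiliar with the term: a capped distance search returns only the pairs of points that lie within a cutoff, together with their distances, instead of a full distance matrix. MDAnalysis provides this as `capped_distance` in `MDAnalysis.lib.distances`. The sketch below is a brute-force NumPy stand-in that only illustrates the shape of the result; it is not the library's implementation, it ignores periodic boundaries, and the helper name `capped_pairs_bruteforce` and the coordinates are made up for illustration.

```python
import numpy as np


def capped_pairs_bruteforce(reference, configuration, max_cutoff):
    """Return (pairs, distances) for all point pairs within max_cutoff.

    Hypothetical helper for illustration only: brute force, no periodic box.
    """
    reference = np.asarray(reference, dtype=float)
    configuration = np.asarray(configuration, dtype=float)
    # Full (n_reference, n_configuration) distance matrix; fine for tiny inputs.
    diff = reference[:, None, :] - configuration[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep only the pairs inside the cutoff.
    ref_idx, conf_idx = np.nonzero(dist <= max_cutoff)
    pairs = np.column_stack((ref_idx, conf_idx))
    return pairs, dist[ref_idx, conf_idx]


# Made-up coordinates: two "donors" and three "acceptors".
donors = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
acceptors = np.array([[0.0, 0.0, 2.5], [10.0, 0.0, 5.0], [0.0, 2.0, 0.0]])

pairs, distances = capped_pairs_bruteforce(donors, acceptors, max_cutoff=3.0)
for (i, j), d in zip(pairs, distances):
    print(f"donor {i} - acceptor {j}: {d:.2f}")
```

Each row of `pairs` holds one index into each input array, so later steps can work on the surviving pairs only.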

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

codecov · 2020-01-29T17:35:09Z

Codecov Report

❗ No coverage uploaded for pull request base (develop@5f4915c).
The diff coverage is n/a.

@@            Coverage Diff             @@
##             develop    #2480   +/-   ##
==========================================
  Coverage           ?   90.28%           
==========================================
  Files              ?      169           
  Lines              ?    23200           
  Branches           ?     3002           
==========================================
  Hits               ?    20945           
  Misses             ?     1654           
  Partials           ?      601

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5f4915c...8ec5951.

xiki-tempula · 2020-01-29T17:54:21Z

I will add some tests to bump the coverage up.

xiki-tempula · 2020-01-30T17:16:52Z

@richardjgowers Hi Richard, I have bumped the coverage. Would you mind doing a code review? If you think it is OK, I will update the changelog.

package/MDAnalysis/analysis/hbonds/wbridge_analysis.py

richardjgowers · 2020-02-05T09:56:31Z

+            )
+        )
+        hbond_indices = np.where(angles > self.angle)[0]
+        for index in hbond_indices:


This pattern of building a nice numpy array and then iterating over it (rather than slicing efficiently) is a bit ugly. Ideally you could make some sort of numpy structured array and slice into it. Everything you're loading into the result array can be obtained by slicing its origin array with hbond_indices, so you could grab the data with one operation per column rather than one per row.

But maybe that's beyond the scope of this PR; this will work.

xiki-tempula · 2020-02-05

Thank you for the suggestion. I will open a separate PR for it after this one is merged.

xiki-tempula · 2020-02-07

After giving it some thought, I'm not quite sure which representation you have in mind.
I start with a list of index triples (hydrogen bond donor heavy atom, hydrogen bond donor hydrogen, hydrogen bond acceptor), which is an array of roughly (1000,3).
The distance search then returns the distances and cuts the number of candidates significantly, giving an array of (100,4), where the last column is the distance.
The angle search shrinks the number of candidates again and gives the final result of (10,5), where the last column is the angle.

To make a structured array, we need to know the array size in advance.
Do you suggest that I make a structured array of (1000,5) and then slice it down to (10,5)?
Or is it better to reconstruct an array at each step?
(1000,3) > (100,3)+(100,1) > (10,4)+(10,1)
Thank you.
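
The "reconstruct at each step" option in the question above can be sketched with plain index arrays. This is an illustration under stated assumptions, not the PR's code: the candidate triples, distances and angles are random stand-ins, and the cutoffs (3.0 for distance, 120 degrees for angle) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_candidates = 1000

# Columns: donor heavy atom index, donor hydrogen index, acceptor index.
candidates = rng.integers(0, 5000, size=(n_candidates, 3))

# Stand-in for the distance search: one distance per candidate.
distances = rng.uniform(1.0, 10.0, size=n_candidates)
distance_cutoff = 3.0
# Row numbers of the candidates that survive the distance filter.
close = np.flatnonzero(distances <= distance_cutoff)

# Stand-in for the angle calculation, done for the survivors only.
angles_close = rng.uniform(90.0, 180.0, size=close.size)
angle_cutoff = 120.0
keep = angles_close > angle_cutoff

# Row numbers (into the original arrays) of the final hydrogen bonds.
hbond = close[keep]

# One slicing operation per column block, no Python loop over rows.
final_indices = candidates[hbond]
final_distances = distances[hbond]
final_angles = angles_close[keep]

print(candidates.shape, "->", close.size, "->", hbond.size)
```

Because every step carries row numbers back to the original arrays, no array has to be preallocated at the largest size.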

richardjgowers · 2020-02-09

You know the size of hbond_indices, so rather than running a loop iteration for each entry, you could create a structured array with one row per entry in hbond_indices and fill that. Not too important though.
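
One way to read this suggestion is sketched below: allocate a structured array sized by `hbond_indices` and fill it one field at a time. The field names, the dtype layout and the small input arrays are assumptions made for illustration; the thread does not specify them.

```python
import numpy as np

# Hypothetical per-candidate arrays, all aligned on the same row order.
donor_ix = np.array([10, 11, 12, 13, 14])
hydrogen_ix = np.array([20, 21, 22, 23, 24])
acceptor_ix = np.array([30, 31, 32, 33, 34])
distances = np.array([2.8, 2.9, 3.0, 2.7, 2.6])
angles = np.array([150.0, 100.0, 170.0, 110.0, 125.0])

angle_cutoff = 120.0
# Same selection idiom as in the diff under review.
hbond_indices = np.where(angles > angle_cutoff)[0]

# Assumed field names; one record per accepted hydrogen bond.
hbond_dtype = np.dtype([
    ("donor", np.intp),
    ("hydrogen", np.intp),
    ("acceptor", np.intp),
    ("distance", np.float64),
    ("angle", np.float64),
])

# The final size is known up front, so allocate once ...
result = np.empty(hbond_indices.size, dtype=hbond_dtype)
# ... and fill with one fancy-indexing operation per column.
result["donor"] = donor_ix[hbond_indices]
result["hydrogen"] = hydrogen_ix[hbond_indices]
result["acceptor"] = acceptor_ix[hbond_indices]
result["distance"] = distances[hbond_indices]
result["angle"] = angles[hbond_indices]

print(result)
```

This costs five array assignments regardless of how many hydrogen bonds are found, where a per-row loop costs one iteration per bond.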

xiki-tempula · 2020-02-05T11:02:02Z

@richardjgowers Thank you for the review. I guess the major issue in this PR is that I need to represent some atom groups as lists of atom indices. To avoid the confusion of having some atom groups represented as indices and others as atom groups, I decided that all atom groups should be represented as lists of indices, which changed some idioms.

richardjgowers

@xiki-tempula has this been benchmarked at all?

xiki-tempula · 2020-02-07T11:40:48Z

@richardjgowers Yes, it has been benchmarked.
Run time by water bridge order (N) for the new and old implementations:

N	new (sec)	old (sec)
0	1	1.3
1	6	12
2	21	40
3	52	95
4	123	205

EDIT: reformatted data as table — @orbeckst

orbeckst · 2020-02-07T21:11:43Z

Does this mean that the new implementation is much slower?

xiki-tempula · 2020-02-07T21:25:24Z

@orbeckst Sorry for the confusion. These are the times required to run the same task, so lower is better. The new implementation runs up to about twice as fast as the old one (for example, 6 s versus 12 s at order 1).
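
The speedup per water bridge order can be read off the benchmark table by dividing the old time by the new time. The short calculation below uses only the numbers reported in the table; the variable names are illustrative.

```python
# Water bridge order -> (new time in s, old time in s), copied from the table.
timings = {
    0: (1, 1.3),
    1: (6, 12),
    2: (21, 40),
    3: (52, 95),
    4: (123, 205),
}

# Speedup > 1 means the new implementation is faster.
speedups = {order: old / new for order, (new, old) in timings.items()}

for order, speedup in speedups.items():
    print(f"order {order}: {speedup:.2f}x")
```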

richardjgowers

@xiki-tempula can you put in a quick CHANGELOG entry so that you get included in the release notes?

Replace distance search with capped_distance

e38ae9c

xiki-tempula mentioned this pull request Jan 29, 2020

Release 1.0 #2443

Closed

6 tasks

zhiyiwu added 2 commits January 30, 2020 10:25

Add heavy test

15bbdfe

update water update

fb55e7b

xiki-tempula changed the title ~~Replace distance search with capped_distance~~ Replace distance search with capped_distance in water bridge analysis Jan 31, 2020

richardjgowers requested changes Feb 5, 2020

richardjgowers reviewed Feb 7, 2020

richardjgowers approved these changes Feb 9, 2020

xiki-tempula and others added 3 commits February 9, 2020 12:50

Update CHANGELOG

21ae01b

Update CHANGELOG

5f4915c

Merge branch 'develop' into develop

8ec5951

richardjgowers merged commit 121ec06 into MDAnalysis:develop Feb 9, 2020

xiki-tempula mentioned this pull request Feb 10, 2020

Replace the distance search of finding hydrogens with a dictionary lookup #2519

Merged

4 tasks

fiona-naughton added enhancement Component-Analysis labels Sep 26, 2023
