WIP: HIV transmission for short term partners #70

mmcleod89 · 2022-08-19T18:30:39Z

Still needs tests to check that things are correct, which might benefit from some refactoring. Not all factors for HIV transmission are accounted for yet, as they don't exist in the model, but I did add a dummy initialisation for viral load groups since that was the most complicated effect and central to the calculation rather than just being a post-factor.

We will need a data file for transmission properties too, but I think that should be a separate PR.

ageorgou

I've not read through the whole thing yet, only making some quick comments.

I need to look at the transmission probabilities more closely because it seems we get the ratios the wrong way around, although that's probably me reading it wrong!

Does applying stp_HIV_transmission to each person individually give okay performance? (from what we can tell, anyway)

We may need to rethink the age groups and if they should be in common, like the sex type. Otherwise pass a reference to the population into the module?

ageorgou · 2022-08-22T18:29:09Z

src/hivpy/hiv_status.py

+                sub_pop_indices = population.data.index[(population.data[col.SEX]==sex) & (population.data[col.SEX_MIX_AGE_GROUP] == age_group)]
+                sub_pop = population.data.loc[sub_pop_indices]


I think we can get this in one step?

Suggested change

-                sub_pop_indices = population.data.index[(population.data[col.SEX]==sex) & (population.data[col.SEX_MIX_AGE_GROUP] == age_group)]
-                sub_pop = population.data.loc[sub_pop_indices]
+                sub_pop = population.data.loc[(population.data[col.SEX]==sex) & (population.data[col.SEX_MIX_AGE_GROUP] == age_group)]

ageorgou · 2022-08-22T18:33:50Z

src/hivpy/hiv_status.py

+                sub_pop_indices = population.data.index[(population.data[col.SEX]==sex) & (population.data[col.SEX_MIX_AGE_GROUP] == age_group)]
+                sub_pop = population.data.loc[sub_pop_indices]
+                n_stp_total = sum(sub_pop[col.NUM_PARTNERS])  # total number of people partnered to people in this group
+                n_infected = sum(sub_pop.loc[sub_pop[col.HIV_STATUS]][col.NUM_PARTNERS])  # num people parters to HIV+ people in this group


Slightly more idiomatic and (for me) a bit easier to read?

Suggested change

-                n_infected = sum(sub_pop.loc[sub_pop[col.HIV_STATUS]][col.NUM_PARTNERS]) # num people parters to HIV+ people in this group
+                n_infected = sum(sub_pop.loc[sub_pop[col.HIV_STATUS], col.NUM_PARTNERS]) # num people parters to HIV+ people in this group
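For anyone reading along, here is a minimal sketch of the two indexing forms on a toy frame. The column names and values are made up for illustration and are not the hivpy ones; the point is only that a single `.loc[row_mask, column]` call selects the same values as the chained form.

```python
import pandas as pd

# Toy stand-in for population.data; names and values are illustrative only.
data = pd.DataFrame({
    "sex": [0, 0, 1, 1, 0],
    "age_group": [2, 2, 2, 3, 2],
    "hiv_status": [True, False, True, False, True],
    "num_partners": [1, 2, 3, 0, 4],
})

# One-step selection of a subpopulation with a boolean mask.
sub_pop = data.loc[(data["sex"] == 0) & (data["age_group"] == 2)]

# Chained form: filter the rows first, then pick the column.
chained = sum(sub_pop.loc[sub_pop["hiv_status"]]["num_partners"])

# Single .loc call with (row mask, column label).
single = sub_pop.loc[sub_pop["hiv_status"], "num_partners"].sum()
```

Both expressions add up the partners of the HIV+ rows in the selected group; the single-call form also avoids building an intermediate frame.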

ageorgou · 2022-08-22T18:35:04Z

src/hivpy/hiv_status.py

-    def update_HIV_status(self, population: pd.DataFrame):
+    def update_partner_risk_vectors(self, population):
+        """calculate the risk factor associated with each sex and age group"""
+        # Should we be using for loops here or can we do better? 


We should be able to do this with a groupby, which would give us the subpopulations directly.

I thought about using a groupby here, but I don't know if it can be done: we need to access HIV status and number of partners, but we only want to group by sex and age group. I'm not sure how we can access these additional fields if we use a groupby.
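One possible way around this, as a sketch only: grouping by sex and age group does not stop us from aggregating other columns, so the per-group totals can be computed by summing a helper column. The column names and the `partners_of_infected` helper below are hypothetical, not taken from hivpy.

```python
import pandas as pd

# Toy population; column names are illustrative only.
data = pd.DataFrame({
    "sex": [0, 0, 1, 1, 0, 1],
    "age_group": [0, 0, 0, 1, 1, 1],
    "hiv_status": [True, False, True, False, False, True],
    "num_partners": [2, 1, 3, 0, 4, 1],
})

# Helper column: partners held by HIV+ people, zero for everyone else.
with_helper = data.assign(
    partners_of_infected=data["num_partners"].where(data["hiv_status"], 0)
)

# Group only by sex and age group, but aggregate the other columns.
totals = with_helper.groupby(["sex", "age_group"])[
    ["num_partners", "partners_of_infected"]
].sum()

# Fraction of partnerships in each group that involve an HIV+ person.
# Groups with no partners at all would give NaN here and need handling.
fraction = totals["partners_of_infected"] / totals["num_partners"]
```

The result is indexed by (sex, age group), so it could replace the nested for loops if the risk vectors are then filled from `fraction`.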

ageorgou · 2022-08-22T18:37:17Z

src/hivpy/hiv_status.py

+                sub_pop_indices = population.data.index[(population.data[col.SEX]==sex) & (population.data[col.SEX_MIX_AGE_GROUP] == age_group)]
+                sub_pop = population.data.loc[sub_pop_indices]
+                n_stp_total = sum(sub_pop[col.NUM_PARTNERS])  # total number of people partnered to people in this group
+                n_infected = sum(sub_pop.loc[sub_pop[col.HIV_STATUS]][col.NUM_PARTNERS])  # num people parters to HIV+ people in this group


Unless I'm misunderstanding, would this be better called n_partners_infected or n_stp_of_infected or similar?

Yes, you're correct. That name should have been changed when I corrected this calculation (I mistakenly used the number of people with HIV at first, instead of the number of people coupled to someone with HIV), so I will change it.

ageorgou · 2022-08-22T18:40:55Z

src/hivpy/hiv_status.py

+
+    def set_dummy_viral_load(self, population):
+        """Dummy function to set viral load until this part of the code has been implemented properly"""
+        population.data[col.VIRAL_LOAD_GROUP] = rng.choice(7, population.size)


Let's add this variable to #48? Or make an issue somewhere for properly implementing it.

I very much forgot #48 existed, so yes I will add this to it.

ageorgou · 2022-08-22T18:45:52Z

src/hivpy/hiv_status.py

+    def stp_HIV_transmission(self, person):
+        # TODO: Add circumcision, STIs etc. 
+        """Returns True if HIV transmission occurs, and False otherwise"""
+        stp_viral_groups = np.array([rng.choice(7, p=self.stp_viral_group_rate[1 - person[col.SEX]][age_group]) for age_group in person[col.STP_AGE_GROUPS]])


Instead of the 1 - person[col.SEX], we could add a method to the SexType enum (or a convenience function) that gives the "opposite" sex.

ageorgou · 2022-08-22T18:48:02Z

src/hivpy/hiv_status.py

+        # TODO: Add circumcision, STIs etc. 
+        """Returns True if HIV transmission occurs, and False otherwise"""
+        stp_viral_groups = np.array([rng.choice(7, p=self.stp_viral_group_rate[1 - person[col.SEX]][age_group]) for age_group in person[col.STP_AGE_GROUPS]])
+        HIV_probabilities = np.array([self.stp_HIV_rate[1 - person[col.SEX]][age_group] for age_group in person[col.STP_AGE_GROUPS]])


The SAS code uses "rate" and "probability" almost interchangeably, resulting in some confusion. It'd be worth being more consistent from the start in the new model - if we can understand what each value is...

This is a good point; I suppose the most correct usage would be to reserve "rate" for things which need to be time-step adjusted, so which variables you call a rate or a probability will depend on the point in the calculation at which we take the length of the timestep into account.

At the moment this doesn't take timestep length into account at all, so there isn't really a difference between the notion of a rate and a probability yet.
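To make the distinction concrete, here is a small sketch of one common convention, which is not something this PR implements: if a quantity is a constant rate (events per unit time), the probability of at least one event in a time step of length `dt` is `1 - exp(-rate * dt)`. The function name and the example values are hypothetical.

```python
import math


def rate_to_probability(rate: float, dt: float) -> float:
    """Probability of at least one event within a time step of length dt,
    assuming a constant rate (hazard) over that step."""
    return 1.0 - math.exp(-rate * dt)


# Example: a rate of 0.1 per year over a quarter-year time step.
p_quarter = rate_to_probability(0.1, 0.25)
```

For small `rate * dt` this is close to `rate * dt` but never exceeds 1, which is one reason to keep the two names separate in the new model.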

ageorgou · 2022-08-22T18:53:00Z

src/hivpy/hiv_status.py

+        stp_viral_groups = np.array([rng.choice(7, p=self.stp_viral_group_rate[1 - person[col.SEX]][age_group]) for age_group in person[col.STP_AGE_GROUPS]])
+        HIV_probabilities = np.array([self.stp_HIV_rate[1 - person[col.SEX]][age_group] for age_group in person[col.STP_AGE_GROUPS]])
+        viral_transmission_probabilities = np.array([max(0, rng.normal(self.transmission_means[group], self.transmission_sigmas[group])) for group in stp_viral_groups])
+        if person[col.SEX] == SexType.Female:


Preferred per the docs on enums:

Suggested change

-        if person[col.SEX] == SexType.Female:
+        if person[col.SEX] is SexType.Female:
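One caveat worth checking before switching to `is`, shown as a sketch: `SexType` is an `IntEnum`, so a plain integer compares equal to a member but is not identical to it. Whether `person[col.SEX]` holds enum members or plain integers depends on how the column is stored, which is an assumption here.

```python
from enum import IntEnum


class SexType(IntEnum):
    Male = 0
    Female = 1


member = SexType.Female
raw = 1  # e.g. a value read back from an integer-typed column

member_is_female = member is SexType.Female            # identity holds for the member
raw_equals_female = raw == SexType.Female               # IntEnum compares equal to ints
raw_is_female = raw is SexType.Female                   # a plain int is not the member
converted_is_female = SexType(raw) is SexType.Female    # converting restores identity
```

If the column turns out to hold plain integers, either `==` or an explicit `SexType(...)` conversion would be needed for the check to behave as intended.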

ageorgou · 2022-08-22T18:59:09Z

src/hivpy/hiv_status.py

+        self.update_partner_risk_vectors(population)
+        HIV_neg_idx = population.data.index[(population.data[col.HIV_STATUS]==False) & (population.data[col.NUM_PARTNERS]>0)]
+        sub_pop = population.data.loc[HIV_neg_idx]
+        population.data.loc[HIV_neg_idx, col.HIV_STATUS] = sub_pop.apply(self.stp_HIV_transmission, axis=1)


Hm, this seems like a prime candidate for vectorization if we can. But I guess the value of STP_AGE_GROUPS will be different for everyone? Perhaps we could combine the two steps (choice of partner age groups and probability of transmission) somehow.

ageorgou

I think there's a bug in the infection probabilities, unless I'm misunderstanding - otherwise looks fine though!

ageorgou · 2022-10-13T08:17:44Z

src/tests/test_hiv_status.py

+    # transform group fails when only grouped by one field
+    # appears to change the type of the object passed to the function!


Just noting that this should be fixed as part of #71, so we can update the following lines when merged.

ageorgou · 2022-10-13T08:19:32Z

src/tests/test_hiv_status.py

+    male_HIV_status = pop.transform_group([col.SEX_MIX_AGE_GROUP, col.SEX], lambda x, y: np.array(
+        [True] * (2 * N_group // HIV_ratio) +
+        [False] * (N_group - 2*N_group // HIV_ratio)), False, males)


Although I'm confused by what the (lambda) function does here: neither argument is used?

This is just because the function passed to transform_group has to take as many arguments as there are variables grouped by, which is something we could change by making transform_group more flexible.

ageorgou · 2022-10-13T08:22:25Z

src/tests/test_hiv_status.py

Remove when done 😁

This whole test was already removed in a62a146

ageorgou · 2022-10-13T08:29:57Z

src/hivpy/common.py

@@ -35,6 +35,8 @@ class SexType(IntEnum):
    Male = 0
    Female = 1

+def opposite_sex(sex: SexType):
+    return (1 - sex)


Personally I would prefer to keep the abstraction here, so something like

return SexType.Male if sex is SexType.Female else SexType.Female

Ah, but that doesn't work if called on a whole column, does it? Hm. There's probably a way to vectorise this while preserving the abstraction but I guess it's not crucial.
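A possible way to keep the abstraction while still accepting a whole column, as a sketch only: dispatch on whether the argument is a pandas Series and map through the enum members rather than using arithmetic. The dispatch and the `_OPPOSITE` table are hypothetical, not what the PR does.

```python
from enum import IntEnum

import pandas as pd


class SexType(IntEnum):
    Male = 0
    Female = 1


# Lookup table expressed in terms of the enum members, not arithmetic.
_OPPOSITE = {SexType.Male: SexType.Female, SexType.Female: SexType.Male}


def opposite_sex(sex):
    """Return the opposite sex for a single value, or elementwise for a Series."""
    if isinstance(sex, pd.Series):
        return sex.map(_OPPOSITE)
    return _OPPOSITE[SexType(sex)]


column_result = opposite_sex(pd.Series([0, 1, 1]))
scalar_result = opposite_sex(SexType.Male)
```

This relies on `IntEnum` members hashing and comparing like their integer values, so a column of plain integers is mapped too.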

ageorgou · 2022-10-13T08:41:10Z

src/hivpy/hiv_status.py

+                    infection_prob[i] *= self.fold_change_yw
+                else:
+                    infection_prob[i] *= self.fold_change_w
+            return infection_prob


This should be out of the loop? (currently returning after computing probability for the first partner only?)

Yes, it should be out of the loop!
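For the record, a stripped-down illustration of the effect, using hypothetical function names and a single scaling factor in place of the real fold changes: returning inside the loop means only the first partner's probability is adjusted.

```python
def scale_inside_loop(probs, factor):
    """Buggy pattern: the return sits inside the loop body."""
    probs = list(probs)
    for i in range(len(probs)):
        probs[i] *= factor
        return probs  # exits during the first iteration
    return probs  # only reached when probs is empty


def scale_after_loop(probs, factor):
    """Fixed pattern: return once every entry has been scaled."""
    probs = list(probs)
    for i in range(len(probs)):
        probs[i] *= factor
    return probs


buggy = scale_inside_loop([0.1, 0.2], 2)
fixed = scale_after_loop([0.1, 0.2], 2)
```

With two partners, the buggy version leaves the second probability untouched, while the fixed version scales both.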

ageorgou · 2022-10-13T08:42:02Z

src/hivpy/hiv_status.py

+    def get_infection_prob(self, sex, age, n_partners, stp_age_groups):
+        # Slow example that avoid repeating the iterations over partners three time by putting them as part of 
+        # one for loop, but for loops in python will be slow.


Yeah, it'd be nice to be able to vectorise this, but having to repeat it for multiple partners makes it tricky... Will try to think of something for a future version.

ageorgou · 2022-10-13T08:54:32Z

src/hivpy/hiv_status.py

+                    population.data[col.SEX_MIX_AGE_GROUP] == age_group)]
+                # total number of people partnered to people in this group
+                n_stp_total = sum(sub_pop[col.NUM_PARTNERS])
+                # num people partered to HIV+ people in this group


Suggested change

-                # num people partered to HIV+ people in this group
+                # num people partnered to HIV+ people in this group

ageorgou · 2022-10-13T08:59:06Z

src/hivpy/hiv_status.py

+        stp_viral_groups = np.array([rng.choice(7, p=self.stp_viral_group_rate[opposite_sex(
+            person[col.SEX])][age_group]) for age_group in person[col.STP_AGE_GROUPS]])


I find something like the following a more readable way to break it up:

Suggested change

-        stp_viral_groups = np.array([rng.choice(7, p=self.stp_viral_group_rate[opposite_sex(
-            person[col.SEX])][age_group]) for age_group in person[col.STP_AGE_GROUPS]])
+        stp_viral_groups = np.array([
+            rng.choice(7, p=self.stp_viral_group_rate[opposite_sex(person[col.SEX])][age_group])
+            for age_group in person[col.STP_AGE_GROUPS]
+        ])

but that's personal preference! (and the second line is actually longer this way... though less than 100)

mmcleod89 · 2022-10-13T09:36:02Z

test_rred keeps failing on this and another branch but not on my machine; is this because of the pandas issue we saw before?

ageorgou · 2022-10-13T09:52:26Z

No, it's a pandas bug in 1.5.0, see also this comment. We're pinning it to 1.3.x in #71 but haven't added that here.

mmcleod89 added 6 commits July 26, 2022 17:47

Add basic functionality for calculating number of infected partners

37a88ce

Add comments to stp infections

e9b8377

Merge branch 'refactor-transforms' into stp-HIV-transmission

32d9cc7

Merge branch 'development' into stp-HIV-transmission

6b549d2

Calculate risks of stp having HIV & viral load

7a92a3e

Correct stp HIV+ probability, start adding tests

849a361

ageorgou reviewed Aug 22, 2022

mmcleod89 added 5 commits August 25, 2022 12:10

Test HIV & VLG probability

4bf3863

Style fixes

2d464a3

Add opposite sex function and TODO comment

369556e

Flake8 compliance, remove abandoned test attempt

a62a146

alterations for style and clarity

1d48503

ageorgou suggested changes Oct 13, 2022

Address comments; fix infection prob bug

a9ed103

ageorgou mentioned this pull request Oct 13, 2022

Introduce HIV into the population #69

Merged

mmcleod89 added 2 commits October 13, 2022 11:55

Merge branch 'development' into stp-HIV-transmission

fcb5e3d

Fix flake8

64bf5c4

ageorgou approved these changes Oct 13, 2022

mmcleod89 added 5 commits October 13, 2022 12:12

Int num partners being converted to float somehow

a485687

convert num_partners to int directly

d6467df

line length

bfd0fe9

Revert "line length"

40ce037

This reverts commit bfd0fe9.

Revert "convert num_partners to int directly"

ed8a539

This reverts commit d6467df.

mmcleod89 merged commit faf0c3d into development Oct 13, 2022

mmcleod89 deleted the stp-HIV-transmission branch July 13, 2023 13:32
