Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken CheckLiveBillRunService #841

Merged
merged 5 commits into from
Mar 20, 2024
Merged

Conversation

Cruikshanks
Copy link
Member

https://eaflood.atlassian.net/browse/WATER-4379

We recently updated the CheckLiveBillRunService to support this project handling annual billing (see Handle live bill run check for annual bill runs).

We thought all was well until we spotted some strange behaviour when testing. This app wouldn't create new SROC supplementary bill runs because CheckLiveBillRunService was claiming one was in progress. But a check of the DB found that wasn't the case.

We added some extra debug logging, restarted the app and tried again ... and it worked fine! Ok, chalk it up to 'puters and their weird ways.

Till it happened again. We looked again at the source and immediately spotted a flaw. But we thought it was unrelated.

PR #761 updated the code in the service to this

  const statuses = LIVE_STATUSES

  // Only one annual bill run per region and financial year is allowed. So, we include sent and sending in the statues
  // to check for
  if (batchType === 'annual') {
    statuses.push('sent', 'sending')
  }

  const numberOfLiveBillRuns = await BillRunModel.query()
    .select(1)
    .where({
      regionId,
      toFinancialYearEnding,
      batchType,
      scheme: 'sroc'
    })
    .whereIn('status', LIVE_STATUSES)
    .resultSize()

  return numberOfLiveBillRuns !== 0

The flaw we spotted was .whereIn('status', LIVE_STATUSES). Even though we'd created a new statuses object that we were updating if the batch type is 'annual', we weren't using it in our query. That line should have been .whereIn('status', statuses).

But hang on, our unit tests were saying everything is okay and working for both batch types. Huh!? 🫤

And then we had our 'a-ha' moment. 😁

This line const statuses = LIVE_STATUSES is not copying LIVE_STATUSES to statuses. It is setting statuses and LIVE_STATUSES to be the same thing. It is essentially the same thing as Passing by reference vs passing by value.

Say I want to share a web page with you. If I tell you the URL, I'm passing by reference. You can use that URL to see the same web page I can see. If that page is changed, we both see the changes. If you delete the URL, all you're doing is destroying your reference to that page - you're not deleting the actual page itself.

If I print out the page and give you the printout, I'm passing by value. Your page is a disconnected copy of the original. You won't see any subsequent changes, and any changes that you make (e.g. scribbling on your printout) will not show up on the original page. If you destroy the printout, you have destroyed your copy of the object - but the original web page remains intact.

Our unit tests worked because the supplementary test runs before the annual. When you run the annual it causes the LIVE_STATUSES to get updated hence the query works and the test passes.

Now, you can apply this same behaviour to an environment. When the app first starts LIVE_STATUSES won't include 'sending' and 'sent'. So, supplementary bill runs will go through. Then someone creates an annual bill run. Now LIVE_STATUSES does include those values and because it is declared outside of the scope of the function, it will stay that way for subsequent calls. This means the next supplementary bill run will fail because the query will return a result.

Don't you love programming?! 🤦😧😬

This change fixes the issue.

https://eaflood.atlassian.net/browse/WATER-4379

We recently updated the `CheckLiveBillRunService` to support this project handling annual billing (see [Handle live bill run check for annual bill runs](#761)).

We thought all was well until we spotted some strange behaviour when testing. This app wouldn't create new SROC supplementary bill runs because `CheckLiveBillRunService` was claiming one was in progress. But a check of the DB found that wasn't the case.

We added some extra debug logging, restarted the app and tried again ... and it worked fine! Ok, chalk it up to 'puters and their weird ways.

Till it happened again. We looked again at the source and immediately spotted a flaw. But we thought it was unrelated.

PR #761 updated the code in the service to this

```javascript
  const statuses = LIVE_STATUSES

  // Only one annual bill run per region and financial year is allowed. So, we include sent and sending in the statues
  // to check for
  if (batchType === 'annual') {
    statuses.push('sent', 'sending')
  }

  const numberOfLiveBillRuns = await BillRunModel.query()
    .select(1)
    .where({
      regionId,
      toFinancialYearEnding,
      batchType,
      scheme: 'sroc'
    })
    .whereIn('status', LIVE_STATUSES)
    .resultSize()

  return numberOfLiveBillRuns !== 0
```

The flaw we spotted was `.whereIn('status', LIVE_STATUSES)`. Even though we'd created a new `statuses` object that we were updating if the batch type is 'annual', we weren't using it in our query. That line should have been `.whereIn('status', statuses)`.

But hang on, our unit tests were saying everything is okay and working for both batch types. Huh!? 🫤

And then we had our 'a-ha' moment. 😁

This line `const statuses = LIVE_STATUSES` is not _copying_ `LIVE_STATUSES` to `statuses`. It is _setting_ `statuses` and `LIVE_STATUSES` to be the same thing. It is essentially the same thing as [Passing by reference vs passing by value](https://stackoverflow.com/a/430958/6117745).

> Say I want to share a web page with you. If I tell you the URL, I'm passing by reference. You can use that URL to see the same web page I can see. If that page is changed, we both see the changes. If you delete the URL, all you're doing is destroying your reference to that page - you're not deleting the actual page itself.
>
> If I print out the page and give you the printout, I'm passing by value. Your page is a disconnected copy of the original. You won't see any subsequent changes, and any changes that you make (e.g. scribbling on your printout) will not show up on the original page. If you destroy the printout, you have destroyed your copy of the object - but the original web page remains intact.

Our unit tests worked because the supplementary test runs _before_ the annual. When you run the annual it is causing the `LIVE_STATUSES` to get updated hence the query works and the test passes.

Now, you can apply this same behaviour to an environment. When the app first starts `LIVE_STATUSES` won't include `'sending'` and `'sent'`. So, supplementary bill runs will go through. Then someone creates an annual bill run. Now `LIVE_STATUSES` does include those values and because it is declared outside of the scope of the function, it will stay that way for subsequent calls. This means the next supplementary bill run will fail because the query will return a result.

Don't you love programming! 🤦😧😬

This change fixes the issue.
@Cruikshanks Cruikshanks added the bug Something isn't working label Mar 20, 2024
@Cruikshanks Cruikshanks self-assigned this Mar 20, 2024
@Cruikshanks Cruikshanks marked this pull request as ready for review March 20, 2024 12:42
Copy link
Contributor

@Jozzey Jozzey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cruikshanks Cruikshanks merged commit 259be13 into main Mar 20, 2024
6 checks passed
@Cruikshanks Cruikshanks deleted the fix-broken-live-bill-run-check branch March 20, 2024 13:11
Cruikshanks added a commit that referenced this pull request Mar 20, 2024
https://eaflood.atlassian.net/browse/WATER-4416
https://eaflood.atlassian.net/browse/WATER-4379

We spotted an issue with removing a bill from a bill run not setting the supplementary flags on a the licences involved correctly. We've fixed that, but its led to our QA team rigorously re-creating bill runs over and over.

Doing this they spotted we did have an issue with the `CheckLiveBillRunService` that we fixed in [Fix broken CheckLiveBillRunService](#841).

Now they've spotted something else. `CheckLiveBillRunService` includes batch type as a filter. This means it _will_ allow you to create a supplementary bill run, for example, even if another type of bill run is 'in progress' (queued, processing, or ready). AT the time we thought that was ok but now we know better.

We had actually spotted this in our work to migrate the setup bill run journey (see [Handle bill run setup matches an existing bill run](#810)). We hoped we'd be using our version of the journey by now and we could quietly retire `CheckLiveBillRunService` as part of ongoing maintenance and no one would be any the wiser (it has been live for 7 months now!)

But our QA team would rather clear the issues currently found before bringing in the new setup journey for testing.

So, rather than fix the service, we'll replace it with `DetermineBlockingBillRunService` which matches what the legacy service does and deals with this scenario.
Cruikshanks added a commit that referenced this pull request Mar 21, 2024
https://eaflood.atlassian.net/browse/WATER-4416
https://eaflood.atlassian.net/browse/WATER-4379

We spotted an issue with removing a bill from a bill run and it not setting the supplementary flags on the licences involved correctly. We've fixed that, but it's led to our QA team rigorously re-creating bill runs over and over.

Doing this they spotted we did have an issue with the `CheckLiveBillRunService` that we fixed in [Fix broken CheckLiveBillRunService](#841).

Now they've spotted something else. `CheckLiveBillRunService` includes batch type as a filter. This means it _will_ allow you to create a supplementary bill run, for example, even if another type of bill run is 'in progress' (queued, processing, or ready). At the time we thought that was ok but now we know better.

We spotted this in our work to migrate the setup bill run journey (see [Handle bill run setup matches an existing bill run](#810)). We hoped we'd be using our version of the journey by now and we could quietly retire `CheckLiveBillRunService` as part of ongoing maintenance and no one would be any the wiser (it has been live for 7 months now!)

However, our QA team would rather clear the issues currently found before bringing in the new setup journey for testing.

So, rather than fix the service, we'll replace it with `DetermineBlockingBillRunService` which matches what the legacy service does and deals with this scenario.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants