Add test range param to NHS numbers #2947

Merged
merged 1 commit into from
Jun 3, 2024

Conversation

neanias
Contributor

@neanias neanias commented Apr 29, 2024

The NHS sets aside a range of numbers for test purposes (999 000 0000 to 999 999 9999). By default, this generator creates valid NHS numbers that may be in use by actual people in the UK health services. This param switches the generator to the test range, whose numbers aren't in use anywhere.

Motivation / Background

This Pull Request has been created because the NHS sets aside a range of numbers for test purposes (999 000 0000 to 999 999 9999). By default, this generator creates valid NHS numbers that may be in use by actual people in the UK health services. This param switches the generator to the test range, whose numbers aren't in use anywhere.
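
As a rough sketch of what drawing from the reserved range involves, the snippet below picks nine digits starting with 999 and appends a Modulus 11 check digit. The module name `NhsTestNumber` and its methods are hypothetical and are not Faker's implementation. The check-digit rule is the standard one for NHS numbers: weights 10 down to 2, a result of 11 maps to 0, and a result of 10 means the number is invalid.

```ruby
# Illustrative sketch only; NhsTestNumber is a hypothetical name,
# not part of Faker.
module NhsTestNumber
  # First nine digits of the reserved test range (999 000 000 to 999 999 999).
  TEST_RANGE = (999_000_000..999_999_999)

  # Modulus 11 check digit for a nine-digit base.
  # Returns 10 when no valid check digit exists for that base.
  def self.check_digit(base)
    sum = base.to_s.chars.each_with_index.map { |digit, idx| digit.to_i * (10 - idx) }.sum
    (11 - (sum % 11)) % 11
  end

  # Draws bases until one has a valid check digit, then formats it as "999 123 4567".
  def self.generate(random: Random.new)
    loop do
      base = random.rand(TEST_RANGE)
      digit = check_digit(base)
      next if digit == 10 # invalid number, try another base

      return "#{base}#{digit}".sub(/\A(\d{3})(\d{3})(\d{4})\z/, '\1 \2 \3')
    end
  end
end
```

Redrawing on a check digit of 10 is one possible way to handle invalid bases; it keeps every result inside the test range without adjusting the base by hand.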

Additional information

NHS Digital documentation about "synthetic" data

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug, refactor something, or add a feature.
  • Tests and Rubocop are passing before submitting your proposed changes.

Contributor

@thdaraujo thdaraujo left a comment

Interesting, this makes sense to me. Maybe we should always use the test range (fake numbers) by default. We've done something similar for email addresses recently.

What do you think, @stefannibrasil?

  #
  # @faker.version 1.9.2
- def british_number
-   base_number = rand(400_000_001...499_999_999)
+ def british_number(use_test_range: false)
Contributor

Maybe we should always use a test range by default?

Contributor Author

I made it opt-in because I was worried it would break test suites that check for 4XX numbers rather than 999 numbers.

Contributor

Yes, that makes sense too. Maybe the default could be flipped in a future release.

Contributor Author

That was my thinking: deprecate 4XX numbers now and swap the default later.
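
A minimal sketch of that opt-in-then-flip idea, assuming a deprecation warning is printed whenever the legacy range is used. The module `NhsBaseNumber`, the method `pick`, and the warning text are hypothetical; only the `use_test_range:` keyword and the two ranges come from the diff under review.

```ruby
# Hypothetical sketch of an opt-in keyword with a deprecation notice;
# not Faker's code and not the approach that was eventually merged.
module NhsBaseNumber
  LEGACY_RANGE = (400_000_001...499_999_999) # may collide with real NHS numbers
  TEST_RANGE = (999_000_001...999_999_999)   # reserved for test purposes

  # Returns a nine-digit base number; warns when the legacy range is used.
  def self.pick(use_test_range: false, random: Random.new)
    unless use_test_range
      warn 'DEPRECATION: 4XX NHS numbers may belong to real patients; pass use_test_range: true'
    end
    random.rand(use_test_range ? TEST_RANGE : LEGACY_RANGE)
  end
end
```

Callers that assert on 4XX numbers keep passing and see the warning, while `NhsBaseNumber.pick(use_test_range: true)` opts in early. Flipping the default later is then a one-word change.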

@thdaraujo thdaraujo assigned thdaraujo and neanias and unassigned thdaraujo May 7, 2024
- def british_number
-   base_number = rand(400_000_001...499_999_999)
+ def british_number(use_test_range: false)
+   base_number = use_test_range ? rand(999_000_001...999_999_999) : rand(400_000_001...499_999_999)
Contributor

What do you think of using a more readable Ruby alternative, i.e.:

Suggested change
- base_number = use_test_range ? rand(999_000_001...999_999_999) : rand(400_000_001...499_999_999)
+ base_number = if use_test_range
+                 rand(999_000_001...999_999_999)
+               else
+                 rand(400_000_001...499_999_999)
+               end

@stefannibrasil
Contributor

Thank you @neanias, this is good to know. As @thdaraujo mentioned, we did something similar for Faker URL and email addresses. I'm sure there are others, but there are so many generators that it's hard to spot every one that needs changing.

We can add a callout for this change when this PR gets released so folks know ahead of bumping the version. Therefore, I am okay with changing it to only generate test numbers by default and adding the callout in the release notes.

Since we're working on this file, could you also change check_digit to be a private method?

@neanias
Contributor Author

neanias commented May 9, 2024

@stefannibrasil Ok, happy to just move it to be enabled by default! If we're changing the check_digit method to be private, should I also remove the related tests? The code paths will be exercised by the british_number method, I suppose.

@stefannibrasil
Contributor

> @stefannibrasil Ok, happy to just move it to be enabled by default! If we're changing the check_digit method to be private, should I also remove the related tests? The code paths will be exercised by the british_number method, I suppose.

Yes, that's the approach I follow too. No need to test private methods 👍
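
To illustrate covering a private helper through the public method, here is a self-contained sketch. `NhsNumberSketch` and `valid_nhs_number?` are hypothetical stand-ins written for this example, not Faker's source or its test suite. The validator recomputes the Modulus 11 check digit independently, so a bug in the private `check_digit` would surface through `british_number`.

```ruby
# Hypothetical stand-in for the generator; only british_number is public.
class NhsNumberSketch
  class << self
    def british_number(random: Random.new)
      loop do
        base = random.rand(999_000_001...999_999_999)
        digit = check_digit(base)
        next if digit == 10 # no valid check digit for this base, redraw

        return "#{base}#{digit}".sub(/\A(\d{3})(\d{3})(\d{4})\z/, '\1 \2 \3')
      end
    end

    private

    # Modulus 11 check digit; 10 signals an invalid base.
    def check_digit(base)
      sum = base.to_s.chars.each_with_index.map { |d, i| d.to_i * (10 - i) }.sum
      (11 - (sum % 11)) % 11
    end
  end
end

# Independent validator used by the check below (also hypothetical).
def valid_nhs_number?(text)
  return false unless text.match?(/\A\d{3} \d{3} \d{4}\z/)

  digits = text.delete(' ').chars.map(&:to_i)
  sum = digits.first(9).each_with_index.map { |d, i| d * (10 - i) }.sum
  expected = (11 - (sum % 11)) % 11
  expected != 10 && expected == digits.last
end

# The private helper is exercised only via the public method.
sample = NhsNumberSketch.british_number
raise "unexpected output: #{sample}" unless sample.start_with?('999 ') && valid_nhs_number?(sample)
```

Calling `NhsNumberSketch.check_digit` directly raises `NoMethodError`, which is why a test for it would have to go through `british_number` or use `send`.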

Contributor

@stefannibrasil stefannibrasil left a comment

Looking good! Thank you for bringing this up. Could you also add a note in the docs for the new behavior?

https://github.com/faker-ruby/faker/blob/main/doc/default/national_health_service.md

lib/faker/default/national_health_service.rb (outdated, resolved)
Comment on lines 3 to 7
The NHS sets aside a range of numbers from 999 000 0000 to 999 999 9999 for
test purposes. This generator uses this range for creating NHS numbers rather
than the previous 400 000 0010 to 499 999 9999 range. The old range could
produce NHS numbers that were in use by real patients in the UK/England and
Wales.
Contributor

Suggested change
- The NHS sets aside a range of numbers from 999 000 0000 to 999 999 9999 for
- test purposes. This generator uses this range for creating NHS numbers rather
- than the previous 400 000 0010 to 499 999 9999 range. The old range could
- produce NHS numbers that were in use by real patients in the UK/England and
- Wales.
+ The NHS sets aside a range of numbers from 999 000 0000 to 999 999 9999 for
+ test purposes. For more details, see [NHS Synthetic data](https://digital.nhs.uk/services/e-referral-service/document-library/synthetic-data-in-live-environments#synthetic-data-naming-convention).

We can leave the justification for the release notes and keep this directed to how it works 👍

The NHS sets aside a range of numbers for test purposes (999 000 0000 to
999 999 9999). Previously, this generator would generate valid NHS
numbers that may be in use by actual people in the UK health services.
This will now use the test range instead.
Contributor

@stefannibrasil stefannibrasil left a comment

Thank you for bringing this up and improving Faker!

@stefannibrasil stefannibrasil merged commit f11e01a into faker-ruby:main Jun 3, 2024
8 checks passed
@neanias
Contributor Author

neanias commented Jun 3, 2024

Glad we got it through in the end! 🚀

@neanias neanias deleted the add-test-range-param-to-nhs-numbers branch June 6, 2024 14:56
@neanias
Contributor Author

neanias commented Jul 1, 2024

Hello! Just wondering about this going in the recent changelog. It wasn't mentioned that it's potentially a breaking change for people using the generator and expecting the old range. Is it worth adding it later or is it just done for now?

@stefannibrasil
Contributor

Hi @neanias, thanks for the reminder! I've cut a release with a note about this: https://github.com/faker-ruby/faker/releases/tag/v3.4.2

benilovj added a commit to nhsuk/manage-vaccinations-in-schools that referenced this pull request Sep 18, 2024
A few months ago, we monkey patched our codebase (#1338) to work around the faker gem generating real NHS numbers.
This has now been fixed upstream in faker (faker-ruby/faker#2947) and released in https://github.com/faker-ruby/faker/releases/tag/v3.4.2.
tvararu pushed a commit to nhsuk/manage-vaccinations-in-schools that referenced this pull request Sep 18, 2024
A few months ago, we monkey patched our codebase (#1338) to work around the faker gem generating real NHS numbers.
This has now been fixed upstream in faker (faker-ruby/faker#2947) and released in https://github.com/faker-ruby/faker/releases/tag/v3.4.2.
3 participants