Optimize unique_* methods #21

Merged
merged 1 commit into from
Apr 21, 2024

@jgaskins (Contributor) commented Apr 1, 2024

Set is faster than Array for inclusion checks: O(1) on average vs O(n). When you generate a lot of fake data that needs to be unique, it makes a pretty big difference. I discovered this while creating 100k test users in my DB with a unique constraint on email addresses.
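
To illustrate the complexity argument, here is a minimal sketch in Python (the shard itself is Crystal, and this is not its actual code). The helper names `unique_values_list` and `unique_values_set` and the `max_retries` parameter are hypothetical. Both helpers retry a generator until it yields a value not seen before; the only difference is whether the seen values are tracked in a list (linear scan per check) or a set (hash lookup per check).

```python
import random
import time


def unique_values_list(generate, count, max_retries=10_000):
    """Collect `count` distinct values, tracking seen values in a list."""
    seen = []
    for _ in range(count):
        for _ in range(max_retries):
            value = generate()
            if value not in seen:  # O(n): scans the whole list
                seen.append(value)
                break
        else:
            raise RuntimeError("could not generate a unique value")
    return seen


def unique_values_set(generate, count, max_retries=10_000):
    """Same contract, but membership is checked against a set."""
    seen = set()
    result = []  # keeps generation order, which a set alone would not
    for _ in range(count):
        for _ in range(max_retries):
            value = generate()
            if value not in seen:  # O(1) on average: hash lookup
                seen.add(value)
                result.append(value)
                break
        else:
            raise RuntimeError("could not generate a unique value")
    return result


if __name__ == "__main__":
    for fn in (unique_values_list, unique_values_set):
        rng = random.Random(42)  # same seed so both see the same sequence
        start = time.perf_counter()
        fn(lambda: rng.randrange(10**9), 5_000)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"{fn.__name__}: {elapsed_ms:.1f} ms")
```

The gap grows with the number of values requested, since the list version does work proportional to everything generated so far on each check.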

It even makes a huge difference for this test suite. On my machine, the runtime of the `should generate unique result` test went from 656ms to 49ms, or roughly 13x as fast.

Before this commit:

.........................................................................................

Top 10 slowest examples (680m seconds, 99.01% of total time):
  Faker::Address should generate unique result
    656m seconds spec/address_spec.cr:25
  Faker::Internet user_name_with_closed_range_arg
    6.39m seconds spec/internet_spec.cr:30
  Faker::Internet user_name_with_open_range_arg
    6.19m seconds spec/internet_spec.cr:40
  Faker::Internet user_name_with_range_and_separators
    5.94m seconds spec/internet_spec.cr:50
  Faker::Lorem should return deterministic results when seeded
    1.43m seconds spec/lorem_spec.cr:4
  Faker::Internet ip_v4_address
    1.21m seconds spec/internet_spec.cr:105
  Faker::Internet should return deterministic results when seeded
    819µ seconds spec/internet_spec.cr:170
  Faker regexify
    643µ seconds spec/faker_spec.cr:14
  Faker::Internet password_max_with_integer_arg
    512µ seconds spec/internet_spec.cr:70
  Faker::Internet mac_address
    479µ seconds spec/internet_spec.cr:130
  Faker::Internet ip_v6_address
    462µ seconds spec/internet_spec.cr:142

Finished in 686.49 milliseconds
89 examples, 0 failures, 0 errors, 0 pending

After this commit:

.........................................................................................

Top 10 slowest examples (73.1m seconds, 90.82% of total time):
  Faker::Address should generate unique result
    48.5m seconds spec/address_spec.cr:25
  Faker::Internet user_name_with_open_range_arg
    6.37m seconds spec/internet_spec.cr:40
  Faker::Internet user_name_with_closed_range_arg
    6.21m seconds spec/internet_spec.cr:30
  Faker::Internet user_name_with_range_and_separators
    5.83m seconds spec/internet_spec.cr:50
  Faker::Lorem should return deterministic results when seeded
    2.06m seconds spec/lorem_spec.cr:4
  Faker::Internet ip_v4_address
    1.25m seconds spec/internet_spec.cr:105
  Faker::Internet should return deterministic results when seeded
    742µ seconds spec/internet_spec.cr:170
  Faker regexify
    608µ seconds spec/faker_spec.cr:14
  Faker::Internet ip_v6_address
    514µ seconds spec/internet_spec.cr:142
  Faker::Internet password_max_with_integer_arg
    514µ seconds spec/internet_spec.cr:70
  Faker::Internet mac_address
    478µ seconds spec/internet_spec.cr:130

Finished in 80.49 milliseconds
89 examples, 0 failures, 0 errors, 0 pending
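
As a quick check of the numbers in the two runs above, the speedups can be computed directly from the reported timings. This is only arithmetic on the figures shown; the variable names are made up for the example.

```python
# Timings copied from the two spec runs, in milliseconds.
spec_before_ms = 656.0    # "should generate unique result", before
spec_after_ms = 48.5      # same spec, after
suite_before_ms = 686.49  # whole suite, before
suite_after_ms = 80.49    # whole suite, after

spec_speedup = spec_before_ms / spec_after_ms
suite_speedup = suite_before_ms / suite_after_ms

print(f"spec: {spec_speedup:.1f}x, suite: {suite_speedup:.1f}x")
# prints "spec: 13.5x, suite: 8.5x"
```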

@robacarp (Collaborator) left a comment

Seems like a clean change, and well proven too. Thanks @jgaskins

@askn askn merged commit 9223359 into askn:master Apr 21, 2024
@askn (Owner) commented Apr 21, 2024

thanks @jgaskins 🎉
