UniqueGenerator.clear is very slow #1186
Comments
Hiya @kevinburkeomg 👋 This problem really intrigued me. I like diving into performance problems, and this seemed like a great way to really understand how Faker works. I tried a few different approaches to possible performance enhancements (including one that I think was similar to what you recommended):

- [NO CLEAR EFFECT] Test 1 - Short-circuiting `clear` if `@previous_results` is empty
- [NO CLEAR EFFECT] Test 2 - Skip calling `clear` if `@previous_results.empty?`
- [NO CLEAR EFFECT & TEST FAILURE] Test 3 - Change hash init and clearing

Come to find out, changing that segment of code didn't help. Sadly I can't seem to get around the current performance cost. Is there a reason you need to run `clear` after every test? Hopefully this was helpful. I'd love to hear any other ideas you have on how this might be accomplished. Thanks~
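The short-circuit idea behind Tests 1 and 2 can be sketched with a toy stand-in class (this is a hypothetical simplification, not Faker's actual implementation; the name `TinyUniqueGenerator` and its methods are made up for illustration):

```ruby
# Hedged sketch: a toy generator mimicking Faker's per-generator
# result cache, with a clear that short-circuits on an empty cache.
class TinyUniqueGenerator
  attr_reader :previous_results

  def initialize
    @previous_results = {}
  end

  def record(key, value)
    @previous_results[key] = value
  end

  # Test 1/2 idea: do nothing when there is nothing to clear.
  def clear
    return if @previous_results.empty?

    @previous_results.clear
  end
end

gen = TinyUniqueGenerator.new
gen.clear                        # no-op: cache is already empty
gen.record(:name, "Ada Lovelace")
gen.clear
puts gen.previous_results.empty? # prints "true"
```

As the `[NO CLEAR EFFECT]` labels above suggest, a guard like this only helps if the empty-cache check is what's expensive; it does nothing about the cost of finding the generators in the first place.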
The intent was to clean up after any test that used `unique`. It's not obvious that clearing a few objects from a few hashes is going to be very slow.
Perhaps Faker could store a list of pointers to every object that it's seen that had `unique` called on it, and clear only those.
I think that's probably a much better course of action. I had done a lot of experimenting with filtering ObjectSpace down to just the Faker objects. Using a pattern where Faker can identify all the objects it has created might be the best approach. I'll try to spend an hour on this tonight to see if I can measure any difference. Thanks for the idea!
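That registry idea could look roughly like this (a hypothetical sketch; `RegisteredGenerator` and its `REGISTRY` list are invented names, not Faker's real code, which tracks generators differently):

```ruby
# Hedged sketch of the "keep a list of generators" idea: instead of
# scanning the whole heap with ObjectSpace, register each generator
# on creation and clear only the registered ones.
class RegisteredGenerator
  REGISTRY = [] # class-level list standing in for an ObjectSpace scan

  attr_reader :previous_results

  def initialize
    @previous_results = {}
    REGISTRY << self
  end

  def record(key, value)
    @previous_results[key] = value
  end

  # Clearing walks the small registry, not the whole object heap.
  def self.clear_all
    REGISTRY.each { |gen| gen.previous_results.clear }
  end
end

a = RegisteredGenerator.new
b = RegisteredGenerator.new
a.record(:name, "Grace Hopper")
RegisteredGenerator.clear_all
puts a.previous_results.empty? && b.previous_results.empty? # prints "true"
```

The cost of `clear_all` now scales with the number of generators Faker created, not with the total number of live objects in the process.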
Hi, @knittingdev and @kevinburkeomg. Sorry to barge in, but I was interested in speeding up `clear` as well. Below are some benchmarks (in all cases, the current implementation is labeled `old_clear` and the proposed one `clear`):
```ruby
require "benchmark"
require "faker"

def create_clear(num_times, num_unique = 1)
  num_times.times.each do
    num_unique.times { Faker::Name.unique.name }
    Faker::UniqueGenerator.clear
  end
end

def create_old_clear(num_times, num_unique = 1)
  num_times.times.each do
    num_unique.times { Faker::Name.unique.name }
    Faker::UniqueGenerator.old_clear
  end
end

Benchmark.bm do |x|
  x.report { create_clear(10000, 1) }
  x.report { create_old_clear(10000, 1) }
end
#       user     system      total        real
#   1.306549   0.021276   1.327825 (  1.328652)
#  31.035302   0.555962  31.591264 ( 31.602384)

Benchmark.bm do |x|
  x.report { create_clear(10000, 0) }
  x.report { create_old_clear(10000, 0) }
end
#       user     system      total        real
#   0.007437   0.000000   0.007437 (  0.007428)
#  27.514668   0.024085  27.538753 ( 27.545627)
```
```ruby
require "benchmark"
require "faker"

def create_clear(num_times, num_unique = 1)
  num_times.times.each do
    num_unique.times { Class.new(Faker::Base).unique.name }
    Faker::UniqueGenerator.clear
  end
end

def create_old_clear(num_times, num_unique = 1)
  num_times.times.each do
    num_unique.times { Class.new(Faker::Base).unique.name }
    Faker::UniqueGenerator.old_clear
  end
end

Benchmark.bm do |x|
  x.report { create_clear(1000, 20) }
  x.report { create_old_clear(1000, 20) }
end
#      user     system      total        real
#  1.405916   0.000000   1.405916 (  1.407707)
#  9.118875   0.004000   9.122875 (  9.124552)
```

Please let me know if any more tests could be useful.
Hmm, interested and following the issue.
@knittingdev, @kevinburkeomg, @vbrazo, should the PR be merged? The change is simple enough and the performance improvement looks consistent.
`Faker::UniqueGenerator.clear` calls `ObjectSpace.each_object`, which takes about 70ms per invocation, even if there are no objects. If you are calling `Faker::UniqueGenerator.clear` after every test, the 70ms penalty adds up pretty quickly.

Perhaps we could store an `object_created` boolean, and only call `ObjectSpace.each_object` if that boolean indicates an object has been created.

For reference, here's the current contents of `unique_generator.rb`.
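The `object_created` boolean guard proposed above could be sketched like this (a hypothetical simplification: `GuardedClear` and `expensive_scan` are invented names, and the scan body merely stands in for whatever per-object clearing the real `ObjectSpace.each_object` loop does):

```ruby
# Hedged sketch of the proposed object_created guard: only pay for
# the expensive full-heap scan when at least one unique generator
# has actually been created since the last clear.
module GuardedClear
  @object_created = false

  class << self
    attr_accessor :object_created

    def clear
      return 0 unless @object_created # skip the ~70ms scan entirely

      @object_created = false
      expensive_scan                  # stand-in for ObjectSpace.each_object
    end

    private

    # Hypothetical placeholder for the real per-object clearing work.
    def expensive_scan
      ObjectSpace.each_object(Object).count
    end
  end
end

puts GuardedClear.clear  # prints "0": nothing was created, scan skipped
GuardedClear.object_created = true
GuardedClear.clear       # flag set, so the scan actually runs once
```

With this guard, test suites that never touch `unique` would pay essentially nothing for calling `clear` after every test, while suites that do use it keep the current behavior.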