My attempt to speed up Python's random.sample method. I took the original method from CPython's git mirror.
def improved_sample(population, k, shuffle=True):
population and k are used as in the original method. shuffle exists because of how the algorithm behaves: it outputs the selected elements in reversed order. If you need the behavior of the original sample (random elements that are also in random order), leave shuffle as is (True). For maximum speed, call improved_sample with False as the third parameter.
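The implementation itself is not reproduced here, so the following is only a minimal sketch of one technique that matches the description above: selection sampling (Knuth's Algorithm S) walked from the end of the population. This is an assumption, not necessarily the method improved_sample uses, and the name improved_sample_sketch is hypothetical. It illustrates why the raw output comes out in reversed order and why the optional shuffle is needed to match random.sample.

```python
import random


def improved_sample_sketch(population, k, shuffle=True):
    """Hypothetical sketch, NOT the author's implementation.

    Picks k distinct elements by selection sampling, visiting the
    population from its last element to its first.
    """
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError("Sample larger than population or is negative")

    result = []
    needed = k
    # `remaining` counts the elements not yet visited, including the current one.
    for remaining in range(n, 0, -1):
        if needed == 0:
            break
        # Select the current element with probability needed / remaining.
        if random.random() * remaining < needed:
            result.append(population[remaining - 1])
            needed -= 1

    # Without this step the result keeps the reversed population order.
    if shuffle:
        random.shuffle(result)
    return result
```

With shuffle=False the sketch returns the chosen elements in reversed population order, so a caller that only needs a random subset can skip the cost of the final shuffle.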
Speed test for Original sample
Getting 900 elements out of 5000 - 1.7298053771223894 seconds
Getting 2900 elements out of 5000 - 4.912327008111639 seconds
Getting 4900 elements out of 5000 - 8.506528223678792 seconds
Speed test for Improved sample
Getting 900 elements out of 5000 - 1.0697965934666236 seconds
Getting 2900 elements out of 5000 - 2.8724894329827144 seconds
Getting 4900 elements out of 5000 - 4.640079082006107 seconds
Speed test for Improved sample with reversed saved order
Getting 900 elements out of 5000 - 0.26750917583522593 seconds
Getting 2900 elements out of 5000 - 0.356656296830522 seconds
Getting 4900 elements out of 5000 - 0.4441971555274584 seconds
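The timings above do not state how they were measured. A rough sketch of how comparable numbers could be collected with timeit is shown below; the helper name time_sampler and the repetition count are assumptions, so absolute values will differ from the ones listed.

```python
import random
import timeit


def time_sampler(sampler, n=5000, sizes=(900, 2900, 4900), repeats=100):
    """Time sampler(population, k) for each k; `repeats` is an assumed count."""
    population = list(range(n))
    timings = {}
    for k in sizes:
        # Total wall-clock time for `repeats` calls with this sample size.
        timings[k] = timeit.timeit(lambda: sampler(population, k), number=repeats)
    return timings


if __name__ == "__main__":
    for k, seconds in time_sampler(random.sample).items():
        print(f"Getting {k} elements out of 5000 - {seconds} seconds")
```

Passing a different callable with the same (population, k) signature, for example a wrapper that calls improved_sample with shuffle set to False, would time the other variants in the same way.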
Improved sample with reversed saved order (doesn't make sense to show it, but still):