Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accessing tap skims is much slower than taz skims #167

Closed
bstabler opened this issue Mar 17, 2017 · 6 comments
Closed

accessing tap skims is much slower than taz skims #167

bstabler opened this issue Mar 17, 2017 · 6 comments

Comments

@bstabler
Copy link
Contributor

bstabler commented Mar 17, 2017

As part of the multiple-zone system pilot testing, @wusun2 found that accessing tap skims, maz to tap pandas table data, and other data sets, is much slower than accessing the taz skims, which use direct position indexing. The tap skims are using lookups for zone numbers, and the pandas tables are using loc (support for labels) instead of iloc (positions). We need to revise this in the next iteration of the pilot.

@toliwaga
Copy link
Contributor

Pull request #190 fixes this, and other performance issues identified by Wu in https://github.com/UDST/activitysim/wiki/Multiple-Zone-Systems-Tests

Runtime for all NetworkLOS methods including tap_skims (get_tappairs3d) is now scaling pretty linearly with respect to index vector length.

Runtime for best_transit_path (VECTOR_TEST_SIZE 1014699) is now 401.42 seconds (6.7 minutes) on my 2.5 GHz MacBook Pro, down from 6.5 hours on SANDAG 2.30GHz Windows Server.

Here are before/after runtimes on my Mac (except for best_transit_path, which I didn't waste time running in before mode, given Wu's experience with before version...)

### Before runtime optimization
INFO - activitysim - VECTOR_TEST_SIZE 10000000
INFO - activitysim - Time to execute get_taz : 57.722 seconds (0.006 per unit)
INFO - activitysim - Time to execute get_tap : 58.794 seconds (0.006 per unit)
INFO - activitysim - Time to execute get_maz : 76.405 seconds (0.008 per unit)
INFO - activitysim - Time to execute taz_skims : 3.285 seconds (0.0003 per unit)
INFO - activitysim - Time to execute tap_skims : 117.774 seconds (0.012 per unit)
INFO - activitysim - Time to execute get_maz_pairs : 88.374 seconds (0.0088 per unit)
INFO - activitysim - Time to execute get_maz_tap_pairs : 65.745 seconds (0.007 per unit)
INFO - activitysim - Time to execute get_taps_mazs : 37.829 seconds (0.004 per unit)
Total Time network_los 505.928 seconds

VECTOR_TEST_SIZE = 1014699
Time to execute step 'best_transit_path': 23391.2 s (6.5 hours) on SANDAG Windows Server

### After runtime optimization
INFO - activitysim - VECTOR_TEST_SIZE 10000000
INFO - activitysim - Time to execute get_taz : 1.5 seconds (0.0002 per unit)
INFO - activitysim - Time to execute get_tap : 1.868 seconds (0.0002 per unit)
INFO - activitysim - Time to execute get_maz : 7.846 seconds (0.0008 per unit)
INFO - activitysim - Time to execute taz_skims : 3.285 seconds (0.0003 per unit)
INFO - activitysim - Time to execute tap_skims : 5.877 seconds (0.0006 per unit)
INFO - activitysim - Time to execute get_maz_pairs : 6.346 seconds (0.0006 per unit)
INFO - activitysim - Time to execute get_maz_tap_pairs : 3.444 seconds (0.0003 per unit)
INFO - activitysim - Time to execute get_taps_mazs : 38.411 seconds (0.0038 per unit)
Total Time network_los 68.577 seconds

INFO - activitysim - best_transit_path VECTOR_TEST_SIZE 1014699
Time to execute step 'best_transit_path': 401.42 s (6.7 minutes)

@wusun2
Copy link

wusun2 commented Jun 23, 2017

This is great Jeff! I will do a test and update the test memo by 'faking' a SANDAG ABM setting.

@toliwaga
Copy link
Contributor

I look forward to seeing the updated test results. There is probably still some room for performance improvement of best_transit_path but probably more on the order of a factor of two then sixty...

@bstabler
Copy link
Contributor Author

bstabler commented Jun 23, 2017

As discussed on our project call today, we'll look into speeding up get_taps_mazs as well

@bstabler
Copy link
Contributor Author

bstabler commented Jul 8, 2017

  • For most tested methods, the improved software is 30-80 times faster
  • Previously, two model components ("stop location" and "trip mode choices") were not able to complete most of the test scenarios that mimic SANDAG settings, except one trip mode choice scenario, due to the extreme runtime. With the software improvements, all tests finished within reasonable runtime
  • See Wu's memo for more details

@toliwaga
Copy link
Contributor

toliwaga commented Jan 4, 2018

There were several issues with the tests in general, and get_taps_mazs in particular that were negatively affecting the benchmarks.

First off, the timing for the tests included the overhead of randomly selecting the input vectors/dfs. I was doing this with pandas DataFrame sample, which turns out to be pretty slow for large sample sizes (like 6 seconds to sample 20000000 rows from maz_df.) This time was essentially eliminated by replacing pd.DataFrame.sample with np.random.choice on the maz_df index. I was able to replace the pandas sample with a numpy version in all tests except get_maz_pairs and get_maz_tap_pairs, resulting in improved performance in all those tests. Note that this change did not speed up the actual NetworkLOS functions, it just made the tests more accurate.

Secondly, unlike the other test functions, the test function get_taps_mazs() in example_multi/simulation.py was making two calls to get_taps_mazs to exercise the function with and without the attribute selection capability. This resulted in almost doubled runtimes for get_taps_mazs. (I say 'almost' because the attribute selection capability filters out null attribute rows and so returns fewer mazs for each taz and so runs slightly faster than the non-attribute version.)

Thirdly, the per unit runtimes for the tests are slightly misleading for comparisons because get_taps_mazs returns an average of 16 maz/tap rows for each input maz. When teh unit runtimes for get_taps_mazs is expressed in terms of rows returned rather than input rows, its performance looks a lot more similar to the other NetworkLOS functions.

### previous benchmarks
INFO - activitysim - VECTOR_TEST_SIZE 10000000
INFO - activitysim - Time to execute get_taz : 1.5 seconds (0.0002 per unit)
INFO - activitysim - Time to execute get_tap : 1.868 seconds (0.0002 per unit)
INFO - activitysim - Time to execute get_maz : 7.846 seconds (0.0008 per unit)
INFO - activitysim - Time to execute taz_skims : 3.285 seconds (0.0003 per unit)
INFO - activitysim - Time to execute tap_skims : 5.877 seconds (0.0006 per unit)
INFO - activitysim - Time to execute get_maz_pairs : 6.346 seconds (0.0006 per unit)
INFO - activitysim - Time to execute get_maz_tap_pairs : 3.444 seconds (0.0003 per unit)
INFO - activitysim - Time to execute get_taps_mazs : 38.411 seconds (0.0038 per unit)
Total Time network_los 68.577 seconds

New benchmarks:

INFO - activitysim - VECTOR_TEST_SIZE 10000000
INFO - activitysim - Time to execute get_taz : 1.189 seconds (0.0001 per unit)
INFO - activitysim - Time to execute get_tap : 1.248 seconds (0.0001 per unit)
INFO - activitysim - Time to execute get_maz : 1.615 seconds (0.0002 per unit)
INFO - activitysim - Time to execute taz_skims : 2.587 seconds (0.0003 per unit)
INFO - activitysim - Time to execute tap_skims : 4.466 seconds (0.0004 per unit)
INFO - activitysim - Time to execute get_maz_pairs : 5.438 seconds (0.0005 per unit)
INFO - activitysim - Time to execute get_maz_tap_pairs : 3.125 seconds (0.0003 per unit)
INFO - activitysim - Time to execute get_taps_mazs by input : 13.443 seconds (0.0013 per unit)
INFO - activitysim - Time to execute get_taps_mazs by output : 13.443 seconds (0.0001 per unit)

As I remarked, the timings for the get_taps_mazs attribute feature are slightly faster than the version that just returns the list of pairs (at least for 'drive_distance') due to the filtering of missing values:

# get_taps_mazs attribute feature
INFO - activitysim - Time to execute get_taps_mazs drive_distance by input : 11.313 seconds (0.0011 per unit)
INFO - activitysim - Time to execute get_taps_mazs drive_distance by output : 11.313 seconds (0.0001 per unit)

Note: these changes are incorporated in the example_multi branch that I just pushed. I will be submitting a pull request to the master branch shortly.

@toliwaga toliwaga mentioned this issue Jan 4, 2018
@bstabler bstabler closed this as completed Jan 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants