-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding defaults #4
Conversation
there are three basic change here: * addings activitysim/defaults/datasources.py which are similar to urbansim_defaults * an example directory which is similar to bayarea_urbansim (i.e. has data and config dirs right now) * and some sample notebooks for typical workflows We might separate some of this stuff into separate repos in the future but for now I think there is no need.
I'd have only one .gitignore, e.g. add a line for |
Problem is in git you can't add empty directories (I've never understood why) so .gitignore is the placeholder that puts the directory there. I will probably end up adding some small data files and can remove it then. Until then I guess it's a bit awkward... |
I like to put a README in there to explain why there's an empty directory. |
right now only using the first 8 rows of the spec, but we're successfully simulating choices This works by taking the UEC spreadsheet and keeping the coefficients almost formatted as-is. I say almost because the filters and alt names had to be tweaked a bit to match Pandas conventions. This commit is working but brings up a few questions that I will post as issues on github in a bit. This is clearly not a final implementation but something to elicit early feedback.
A separate issue of this pull request is that it's terribly slow right now. Running on the full household table takes about 200s with only 8 columns (it looks like the spec has 150 columns). @jiffyclub Any ideas here? I will profile this but I'm guessing it's in the calls to "eval". |
Just leaving a note to myself for todo items that this PR does NOT include for a complete reimplementation of auto ownership
|
I added UDST/urbansim@f88f665 to the dcm branch to speed up making the interaction dataset. It still takes 15 seconds, but that's better than 2 minutes. The runtime is dominated now by the doing the draws, e.g. https://github.com/synthicity/activitysim/pull/4/files#diff-ae8114eddeabee70b62a76c66a5b85a5R47. |
Perfect - I'll try it out. I also just realized the place the fast choice code exists is in mnl.py - i.e. just don't pass returnprobs=True and it makes the choices for you the best way. Totally forgot about that. I'll give it a shot. |
OK - with the fix you put into interaction, we went from 200s to 100s. I just committed the change to use the choice making from mnl.py and we're down to 43s. Whenever you get a chance, please do another profile of the latest commit and make sure there's not any other easy optimizations. |
Now there are about 15 seconds going to data processing and 36 sec going to prediction. Within prediction ~17 seconds are going to |
the big change here is that a lot of the columns could not be expressed as DataFrame.eval calls and so I'm directly making Python "eval" calls if the expression starts with an "@". This is a proposal and not necessarily permanent functionality - something else to talk about. I like the way this works though as it puts quite a bit of power directly in the spreadsheet for the model specs, much the same way it is now. We have the option of never doing a DataFrame.eval call but these calls are a bit easier to express and are apparently much faster. This checkin also has the performance improvements on the choices and converts windows newlines to regular newlines so you can see the spec on github now
OK on a server, and with the speedups, the auto ownership model is simulating in 50s. Still a few things to add which will slow this down a bit, but so far so good. |
…ibility.csv Easier to add the logsums than I thought it would be. Also moved the standard methods into a new module - activitysim.py - will add unit tests and docstrings if we all think this is the right direction to be headed Also added a global param to settings.yaml called household_sample_size to run with a subset of the total households when debugging
description_name="Description", | ||
expression_name="Expression"): | ||
""" | ||
Read in the excel file and reformat for machines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excel!?! No thanks!
Adding defaults - this PR adds some simple and standard data sources, variables, and models to ActivitySim. I'm going to start merging PRs even though this code is still subject to change somewhat in the future.
fix issue in activitysim.abm.models.util.mode where reset_index level…
cross-border changes after rebase from upstream develop
BayDAG Contribution #4: Joint Tour Destination Performance
Not planning on pulling yet - just want to leave this open to see the diff
Adding basic data sources and several computed variables
So far so good