Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding defaults #4

Merged
merged 8 commits into from
Jan 5, 2015
Merged

Adding defaults #4

merged 8 commits into from
Jan 5, 2015

Conversation

fscottfoti
Copy link
Contributor

Not planning on pulling yet - just want to leave this open to see the diff

Adding basic data sources and several computed variables

So far so good

there are three basic change here:

* addings activitysim/defaults/datasources.py which are similar to urbansim_defaults
* an example directory which is similar to bayarea_urbansim (i.e. has data and config dirs right now)
* and some sample notebooks for typical workflows

We might separate some of this stuff into separate repos in the future but for now I think there is no need.
@jiffyclub
Copy link
Contributor

I'd have only one .gitignore, e.g. add a line for example/data/* to the root .gitignore.

@fscottfoti
Copy link
Contributor Author

Problem is in git you can't add empty directories (I've never understood why) so .gitignore is the placeholder that puts the directory there. I will probably end up adding some small data files and can remove it then. Until then I guess it's a bit awkward...

@jiffyclub
Copy link
Contributor

I like to put a README in there to explain why there's an empty directory.

right now only using the first 8 rows of the spec, but we're successfully simulating choices

This works by taking the UEC spreadsheet and keeping the coefficients almost formatted as-is.  I say almost because the filters and alt names had to be tweaked a bit to match Pandas conventions.  This commit is working but brings up a few questions that I will post as issues on github in a bit.  This is clearly not a final implementation but something to elicit early feedback.
@fscottfoti
Copy link
Contributor Author

A separate issue of this pull request is that it's terribly slow right now. Running on the full household table takes about 200s with only 8 columns (it looks like the spec has 150 columns). @jiffyclub Any ideas here? I will profile this but I'm guessing it's in the calls to "eval".

@fscottfoti
Copy link
Contributor Author

Just leaving a note to myself for todo items that this PR does NOT include for a complete reimplementation of auto ownership

  • Diagnose performance issues
  • Make sure we can compute all the columns from the spec (150 columns or so, not including those below)
  • This includes things like AUTOPEAKRETAIL, etc that are in the spec and need to be taken from the skims
  • And finally the "auto time savings" variable

@jiffyclub
Copy link
Contributor

I added UDST/urbansim@f88f665 to the dcm branch to speed up making the interaction dataset. It still takes 15 seconds, but that's better than 2 minutes. The runtime is dominated now by the doing the draws, e.g. https://github.com/synthicity/activitysim/pull/4/files#diff-ae8114eddeabee70b62a76c66a5b85a5R47.

@fscottfoti
Copy link
Contributor Author

Perfect - I'll try it out. I also just realized the place the fast choice code exists is in mnl.py - i.e. just don't pass returnprobs=True and it makes the choices for you the best way. Totally forgot about that. I'll give it a shot.

@fscottfoti
Copy link
Contributor Author

OK - with the fix you put into interaction, we went from 200s to 100s. I just committed the change to use the choice making from mnl.py and we're down to 43s. Whenever you get a chance, please do another profile of the latest commit and make sure there's not any other easy optimizations.

@jiffyclub
Copy link
Contributor

Now there are about 15 seconds going to data processing and 36 sec going to prediction. Within prediction ~17 seconds are going to describe and ~15 sec are going to mnl_interaction_dataset. Within mnl_interaction_dataset most of the time is going to pandas.merge, with a few seconds going to .take.

the big change here is that a lot of the columns could not be expressed as DataFrame.eval calls and so I'm directly making Python "eval" calls if the expression starts with an "@".  This is a proposal and not necessarily permanent functionality - something else to talk about.  I like the way this works though as it puts quite a bit of power directly in the spreadsheet for the model specs, much the same way it is now.  We have the option of never doing a DataFrame.eval call but these calls are a bit easier to express and are apparently much faster.

This checkin also has the performance improvements on the choices and converts windows newlines to regular newlines so you can see the spec on github now
@fscottfoti
Copy link
Contributor Author

OK on a server, and with the speedups, the auto ownership model is simulating in 50s. Still a few things to add which will slow this down a bit, but so far so good.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 6692d0d on adding-defaults into 5a68ed1 on master.

…ibility.csv

Easier to add the logsums than I thought it would be.

Also moved the standard methods into a new module - activitysim.py - will add unit tests and docstrings if we all think this is the right direction to be headed

Also added a global param to settings.yaml called household_sample_size to run with a subset of the total households when debugging
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 4e87889 on adding-defaults into 5a68ed1 on master.

description_name="Description",
expression_name="Expression"):
"""
Read in the excel file and reformat for machines
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excel!?! No thanks!

fscottfoti added a commit that referenced this pull request Jan 5, 2015
Adding defaults - this PR adds some simple and standard data sources, variables, and models to ActivitySim.  I'm going to start merging PRs even though this code is still subject to change somewhat in the future.
@fscottfoti fscottfoti merged commit 79593f8 into master Jan 5, 2015
@fscottfoti fscottfoti deleted the adding-defaults branch January 5, 2015 21:34
toliwaga added a commit that referenced this pull request Sep 26, 2017
fix issue in activitysim.abm.models.util.mode where reset_index level…
bstabler pushed a commit that referenced this pull request Oct 20, 2021
cross-border changes after rebase from upstream develop
jpn-- added a commit that referenced this pull request Mar 19, 2024
BayDAG Contribution #4: Joint Tour Destination Performance
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants