Discrete Choice Models #134

Merged
merged 16 commits into from
Feb 24, 2015

jiffyclub
Member

This is a big set of changes associated with ActivitySim/activitysim#3.

The goal here is to generalize our existing location choice model classes into discrete choice models with varying capabilities. The LCMs had baked in some assumptions that made them inappropriate for something like automobile ownership, even though the underlying MNL code is fully compatible with general discrete choice modeling. These are some of the assumptions:

  • only one chooser was used for calculating probabilities
  • choices were made for choosers in aggregate because all choosers had the same probabilities, locations are unavailable once chosen, and it results in better performance
  • locations were removed from the alternatives pool at the group level because those locations were no longer available to others once chosen

To support LCMs we need to keep those capabilities, but we also need to be able to calculate probabilities and make choices on a per-chooser basis, as well as not modify alternatives between making choices for different segments.

In this PR I've changed all class names from "LocationChoice" to "DiscreteChoice" and cleaned up docstrings and variable names that referred to locations. I've also added some new options: probability_mode (can be single_chooser or full_product) and choice_mode (can be individual or aggregate) for controlling how probabilities are calculated and choices are made. At the group level there's a new option remove_alts that controls whether alternatives are filtered after performing prediction for a segment.

The defaults are full_product, individual, and False for probability_mode, choice_mode, and remove_alts, respectively. These are the settings you'd use for something like automobile ownership.

For something like LCMs you'd set those to single_chooser, aggregate, and True.
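As a rough illustration of the two configurations described above, here is a small sketch. The option names and values come from this PR, but the dictionary layout and the `check_modes` helper are hypothetical and are not part of the urbansim API. The compatibility check reflects the note in the commit messages that 'aggregate' should only be used with 'single_chooser'.

```python
# Hypothetical settings dictionaries. The option names come from this PR;
# the dict layout and check_modes() are illustrative assumptions only.
AUTO_OWNERSHIP_SETTINGS = {
    'probability_mode': 'full_product',
    'choice_mode': 'individual',
    'remove_alts': False,
}

LCM_SETTINGS = {
    'probability_mode': 'single_chooser',
    'choice_mode': 'aggregate',
    'remove_alts': True,
}


def check_modes(settings):
    """Raise ValueError for unknown or incompatible mode combinations."""
    prob = settings['probability_mode']
    choice = settings['choice_mode']
    if prob not in ('single_chooser', 'full_product'):
        raise ValueError('unknown probability_mode: {}'.format(prob))
    if choice not in ('individual', 'aggregate'):
        raise ValueError('unknown choice_mode: {}'.format(choice))
    # Aggregate choices assume one shared set of probabilities.
    if choice == 'aggregate' and prob != 'single_chooser':
        raise ValueError(
            "'aggregate' choice_mode requires "
            "'single_chooser' probability_mode")
    return True
```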

The Travis runs are failing right now because these changes are breaking sanfran_urbansim. I'll make a PR on there shortly.

We'd hardcoded that when doing prediction a DCM would
select only the first chooser from the choosers table
in order to get a PDF used when assigning choosers
to alternatives.

I've removed that so now all choosers go into making the
interaction dataset and probabilities come back calculated
one per chooser across all alternatives. Obviously we'll need
to figure out a way to make this manageable for doing LCMs.

I think the unit_choice method still needs work; bits and pieces
of the code seem to assume that there will be only one set of
probabilities for all choosers, not probabilities per chooser.
Instead of separately returning probabilities and alternatives
information this groups them all together.
The probabilities have a MultiIndex with chooser IDs on the
outside and alternative IDs on the inside.
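
To make the MultiIndex layout concrete, here is a minimal sketch of per-chooser probabilities with chooser IDs on the outer level and alternative IDs on the inner level. The `mnl_probabilities` function and the utility values are made up for illustration; this is a plain softmax per chooser, not the MNL code in this repository.

```python
import numpy as np
import pandas as pd


def mnl_probabilities(utilities):
    """Softmax over alternatives, computed separately for each chooser.

    `utilities` is a Series indexed by (chooser_id, alternative_id).
    Hypothetical helper, not the library's implementation.
    """
    # Subtract each chooser's max utility for numerical stability.
    shifted = utilities - utilities.groupby(level=0).transform('max')
    exp_u = np.exp(shifted)
    return exp_u / exp_u.groupby(level=0).transform('sum')


index = pd.MultiIndex.from_product(
    [[101, 102], ['a', 'b', 'c']],
    names=['chooser_id', 'alternative_id'])
utilities = pd.Series([0.0, 1.0, 2.0, 1.0, 1.0, 1.0], index=index)

# One probability per (chooser, alternative) pair; each chooser's
# probabilities sum to 1 across the alternatives.
probabilities = mnl_probabilities(utilities)
```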
We need to make choices per chooser because each chooser has
a different probability across the alternatives.
For many discrete choice situations we *don't* want to remove
alternatives that are chosen, so remove that functionality.
Note that none of this functionality is implemented yet,
this commit introduces the arguments to calls, doc strings,
yaml, and tests.

The options will allow users to specify how they want probabilities calculated
and choices made. This will allow users to choose between
calculating probabilities for all choosers or just one,
and if they want choices made per chooser or for all choosers at once.
Users can also provide their own functions for calculating those things.
Added support for modes 'single_chooser' and 'full_product'
when calculating probabilities. These actually affect the merging
of choosers and alternatives into an interaction dataset.
In 'single_chooser' only the first chooser in the choosers table
(after filtering) is used to construct the merged interaction table.
This is the same behavior as UrbanSim's previous LCM class.
In 'full_product' mode all choosers are merged with all alternatives.
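
The difference between the two probability modes can be sketched as below. `build_interaction` is a hypothetical helper written for this illustration, assuming choosers and alternatives are pandas DataFrames with no overlapping column names; the merge code in the PR itself may differ.

```python
import pandas as pd


def build_interaction(choosers, alternatives, probability_mode):
    """Merge choosers with alternatives into one interaction table.

    Hypothetical helper, not the function used in this PR.
    """
    if probability_mode == 'single_chooser':
        # Only the first chooser (after any filtering) is used.
        choosers = choosers.head(1)
    elif probability_mode != 'full_product':
        raise ValueError(
            'unknown probability_mode: {}'.format(probability_mode))

    # Cross join via a constant key column.
    c = choosers.rename_axis('chooser_id').reset_index().assign(_key=1)
    a = alternatives.rename_axis('alternative_id').reset_index().assign(_key=1)
    merged = c.merge(a, on='_key').drop(columns='_key')
    return merged.set_index(['chooser_id', 'alternative_id'])


choosers = pd.DataFrame({'income': [50, 80, 30]}, index=[1, 2, 3])
alternatives = pd.DataFrame({'price': [10, 20]}, index=['x', 'y'])

full = build_interaction(choosers, alternatives, 'full_product')
single = build_interaction(choosers, alternatives, 'single_chooser')
```

With three choosers and two alternatives, `full` has six rows and `single` has two.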
Users of MNLDiscreteChoiceModel can choose either
'individual' or 'aggregate' mode for matching choosers
to alternatives during prediction.

In 'individual' mode a choice is made individually for every chooser
and each chooser has access to all alternatives.

In 'aggregate' mode choices are made for every chooser at the same time,
which implies that alternatives are unavailable to others once chosen
and that all choosers have the same probabilities over alternatives.
'aggregate' mode should only be used with 'single_chooser'
probability mode.
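
As a sketch of how the two choice modes differ, the functions below draw choices with NumPy. Both function names are hypothetical and this is not the unit_choice code in the repository. It assumes per-chooser probabilities carry a (chooser ID, alternative ID) MultiIndex, and that aggregate mode receives a single probability Series shared by every chooser.

```python
import numpy as np
import pandas as pd


def individual_choices(probabilities, rng):
    """One independent draw per chooser; alternatives may repeat.

    `probabilities` has a (chooser_id, alternative_id) MultiIndex.
    """
    choices = {}
    for chooser_id, probs in probabilities.groupby(level=0):
        alts = np.asarray(probs.index.get_level_values(1))
        choices[chooser_id] = rng.choice(alts, p=probs.values)
    return pd.Series(choices)


def aggregate_choices(chooser_ids, probabilities, rng):
    """Draw for all choosers at once, without replacement.

    `probabilities` is one Series indexed by alternative ID and shared
    by all choosers, so a chosen alternative is unavailable to others.
    """
    picked = rng.choice(
        np.asarray(probabilities.index), size=len(chooser_ids),
        replace=False, p=probabilities.values)
    return pd.Series(picked, index=chooser_ids)


rng = np.random.RandomState(0)
idx = pd.MultiIndex.from_product([[1, 2], ['a', 'b', 'c']])
per_chooser = pd.Series([0.2, 0.3, 0.5, 1.0, 0.0, 0.0], index=idx)
shared = pd.Series([0.2, 0.3, 0.5], index=['a', 'b', 'c'])

individual = individual_choices(per_chooser, rng)
aggregate = aggregate_choices([1, 2, 3], shared, rng)
```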
The remove_alts option controls whether alternatives are removed from
the alternatives pool between doing prediction for different segments.
When doing LCMs (e.g. probability_mode='single_chooser'
and choice_mode='aggregate') this should be set to True.
For doing something like automobile ownership this should be False.
False is the default.
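
A sketch of what removing alternatives between segments could look like. `predict_segments` and `take_first` are hypothetical stand-ins for the group-level prediction loop and a per-segment choice function; only the `remove_alts` name comes from this PR.

```python
import pandas as pd


def predict_segments(segments, alternatives, choose, remove_alts=False):
    """Run `choose` on each segment, optionally shrinking the pool.

    `segments` is a list of chooser DataFrames and `choose` is any
    callable (choosers, alternatives) -> Series of chosen alternative
    IDs. Hypothetical helper, not the group classes in this PR.
    """
    results = []
    for choosers in segments:
        chosen = choose(choosers, alternatives)
        results.append(chosen)
        if remove_alts:
            # Drop chosen alternatives before the next segment.
            taken = alternatives.index.isin(chosen.values)
            alternatives = alternatives.loc[~taken]
    return pd.concat(results)


def take_first(choosers, alternatives):
    """Toy choice rule: hand out alternatives in table order."""
    picks = alternatives.index[:len(choosers)].tolist()
    return pd.Series(picks, index=choosers.index)


alts = pd.DataFrame({'price': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])
seg1 = pd.DataFrame({'x': [0, 0]}, index=[1, 2])
seg2 = pd.DataFrame({'x': [0, 0]}, index=[3, 4])

kept = predict_segments([seg1, seg2], alts, take_first, remove_alts=False)
removed = predict_segments([seg1, seg2], alts, take_first, remove_alts=True)
```

With `remove_alts=False` both segments see the full pool, so the second segment can pick the same alternatives as the first. With `remove_alts=True` the second segment only sees what is left.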
.loc is really slow with large indexes.
We can do things much faster using location based indexing
instead of label based indexing.
Here I'm replacing a .loc with a .take.
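
For illustration, here is one way a label lookup can be turned into a positional one. The commit only says a .loc was replaced with a .take; using `Index.get_indexer` to obtain the positions is an assumption of this sketch, and the table and labels are made up.

```python
import numpy as np
import pandas as pd

alternatives = pd.DataFrame(
    {'price': np.arange(5) * 10.0}, index=list('abcde'))
wanted = np.array(['d', 'b', 'd'])

# Label-based lookup.
by_label = alternatives.loc[wanted]

# Position-based lookup: translate labels to integer positions once,
# then take rows by position. get_indexer needs a unique index and
# returns -1 for labels it cannot find, which take() would read as
# "last row", so missing labels should be checked for first.
positions = alternatives.index.get_indexer(wanted)
assert (positions >= 0).all()
by_position = alternatives.take(positions)
```

Both lookups return the same rows in the same order. In code that already works with integer positions, the translation step is unnecessary.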
jiffyclub added a commit that referenced this pull request Feb 24, 2015
jiffyclub merged commit a94f05e into master Feb 24, 2015
jiffyclub deleted the dcm branch February 24, 2015 22:59