Discrete Choice Models #134
Merged
This was referenced Dec 9, 2014
We'd hardcoded that, when doing prediction, a DCM would select only the first chooser from the choosers table in order to get a PDF used when assigning choosers to alternatives. I've removed that, so now all choosers go into making the interaction dataset and probabilities come back calculated one per chooser across all alternatives. Obviously we'll need to figure out a way to make this manageable for doing LCMs. I think the unit_choice method still needs work: bits and pieces of the code seem to assume that there will be only one set of probabilities for all choosers, not probabilities per chooser.
Instead of separately returning probabilities and alternatives information this groups them all together. The probabilities have a MultiIndex with chooser IDs on the outside and alternative IDs on the inside.
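To make the layout concrete, here is a minimal sketch of a probabilities Series with that kind of MultiIndex. The IDs, level names, and values are made up for illustration and are not taken from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical IDs, for illustration only.
chooser_ids = [10, 11]
alt_ids = ['a', 'b', 'c']

# Chooser IDs form the outer level, alternative IDs the inner level.
index = pd.MultiIndex.from_product(
    [chooser_ids, alt_ids], names=['chooser_id', 'alternative_id'])

# One probability per (chooser, alternative) pair.
probabilities = pd.Series(
    [0.2, 0.3, 0.5,
     0.6, 0.1, 0.3],
    index=index)

# Selecting on the outer level returns one chooser's probabilities,
# indexed by alternative ID.
probs_for_10 = probabilities.loc[10]

# Each chooser's probabilities should sum to 1.
sums = probabilities.groupby(level='chooser_id').sum()
```

Keeping everything in one Series means the chooser/alternative pairing travels with the probabilities instead of living in a separate return value.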
We need to make choices per chooser because each chooser has a different probability across the alternatives. For many discrete choice situations we *don't* want to remove alternatives that are chosen, so remove that functionality.
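A rough sketch of per-chooser choice, assuming probabilities arrive as a Series with a (chooser, alternative) MultiIndex. The function name `choose_per_chooser` is hypothetical and this is not the code in the commit. Note that nothing is removed from the pool, so two choosers can pick the same alternative.

```python
import numpy as np
import pandas as pd


def choose_per_chooser(probabilities, seed=None):
    """Draw one alternative per chooser from that chooser's own
    probabilities. Alternatives stay available after being chosen."""
    rng = np.random.RandomState(seed)
    choices = {}
    for chooser_id, probs in probabilities.groupby(level=0):
        alts = probs.index.get_level_values(1).values
        # Renormalize defensively so the weights sum to exactly 1.
        p = probs.values / probs.values.sum()
        choices[chooser_id] = rng.choice(alts, p=p)
    return pd.Series(choices)


# Made-up, degenerate probabilities so the outcome is easy to read.
index = pd.MultiIndex.from_product([[1, 2, 3], ['a', 'b', 'c']])
probabilities = pd.Series(
    [1.0, 0.0, 0.0,
     0.0, 0.0, 1.0,
     1.0, 0.0, 0.0],
    index=index)

choices = choose_per_chooser(probabilities, seed=0)
```

Here choosers 1 and 3 both end up with alternative 'a', which is what you want for something like automobile ownership.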
Note that none of this functionality is implemented yet; this commit introduces the arguments to calls, docstrings, YAML, and tests. The options will let users choose between calculating probabilities for all choosers or just one, and whether choices are made per chooser or for all choosers at once. Users can also provide their own functions for calculating those things.
Added support for modes 'single_chooser' and 'full_product' when calculating probabilities. These actually affect the merging of choosers and alternatives into an interaction dataset. In 'single_chooser' mode only the first chooser in the choosers table (after filtering) is used to construct the merged interaction table. This is the same behavior as UrbanSim's previous LCM class. In 'full_product' mode all choosers are merged with all alternatives.
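The difference between the two modes can be sketched with plain pandas. `build_interaction` is a hypothetical helper written for this illustration, not the function in the PR, and it assumes the two tables have no column names in common.

```python
import numpy as np
import pandas as pd


def build_interaction(choosers, alternatives, probability_mode='full_product'):
    """Merge choosers with alternatives into an interaction table."""
    if probability_mode == 'single_chooser':
        # Only the first chooser (after any filtering) is used.
        choosers = choosers.head(1)
    elif probability_mode != 'full_product':
        raise ValueError('unknown probability_mode: %r' % probability_mode)

    n_choosers, n_alts = len(choosers), len(alternatives)
    # Repeat each chooser row once per alternative, and tile the
    # alternatives once per chooser, so every pair appears exactly once.
    left = choosers.iloc[np.repeat(np.arange(n_choosers), n_alts)]
    right = alternatives.iloc[np.tile(np.arange(n_alts), n_choosers)]
    merged = pd.concat(
        [left.reset_index(drop=True), right.reset_index(drop=True)], axis=1)
    merged.index = pd.MultiIndex.from_product(
        [choosers.index, alternatives.index],
        names=['chooser_id', 'alternative_id'])
    return merged


# Made-up tables for illustration.
choosers = pd.DataFrame({'income': [10, 20, 30]}, index=[1, 2, 3])
alternatives = pd.DataFrame({'price': [5, 6]}, index=['a', 'b'])

full = build_interaction(choosers, alternatives, 'full_product')
single = build_interaction(choosers, alternatives, 'single_chooser')
```

With 3 choosers and 2 alternatives, `full` has 6 rows and `single` has 2, which is why 'full_product' can get expensive for large LCM-style runs.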
Users of MNLDiscreteChoiceModel can choose either 'individual' or 'aggregate' mode for matching choosers to alternatives during prediction. In 'individual' mode a choice is made individually for every chooser and each chooser has access to all alternatives. In 'aggregate' mode choices are made for every chooser at the same time, which implies that alternatives are unavailable to others once chosen and that all choosers have the same probabilities over alternatives. 'aggregate' mode should only be used with 'single_chooser' probability mode.
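One way to picture 'aggregate' mode is a single weighted draw without replacement from a shared probability vector. This is a sketch under that assumption; `aggregate_choices` is a hypothetical name and the PR's implementation may differ.

```python
import numpy as np
import pandas as pd


def aggregate_choices(chooser_ids, alt_probs, seed=None):
    """Assign alternatives to all choosers in one draw.

    Every chooser shares the same probabilities, and an alternative
    is unavailable to others once chosen (sampling without
    replacement). If there are more choosers than alternatives, only
    the first len(alt_probs) choosers get a choice.
    """
    rng = np.random.RandomState(seed)
    chooser_ids = list(chooser_ids)
    n = min(len(chooser_ids), len(alt_probs))
    p = alt_probs.values / alt_probs.values.sum()
    # Requires at least n alternatives with non-zero probability.
    chosen = rng.choice(alt_probs.index.values, size=n, replace=False, p=p)
    return pd.Series(chosen, index=chooser_ids[:n])


# Made-up shared probabilities over four alternatives.
alt_probs = pd.Series([0.4, 0.3, 0.2, 0.1], index=[100, 101, 102, 103])

choices = aggregate_choices([1, 2, 3], alt_probs, seed=0)
too_many = aggregate_choices([1, 2, 3, 4, 5], alt_probs, seed=0)
```

Because the draw uses one probability vector for everyone, this only makes sense when probabilities came from 'single_chooser' mode.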
The remove_alts option controls whether alternatives are removed from the alternatives pool between doing prediction for different segments. When doing LCMs (e.g. probability_mode='single_chooser' and choice_mode='aggregate') this should be set to True. For doing something like automobile ownership this should be False. False is the default.
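A small sketch of what removing alternatives between segments could look like. Both `predict_by_segment` and the toy `first_available` predictor are hypothetical stand-ins for illustration, not the group class from the PR.

```python
import pandas as pd


def predict_by_segment(segments, alternatives, predict, remove_alts=False):
    """Run `predict` for each segment in turn.

    With remove_alts=True, alternatives chosen in one segment are
    dropped before the next segment is predicted.
    """
    results = []
    for _, choosers in segments.items():
        choices = predict(choosers, alternatives)
        results.append(choices)
        if remove_alts:
            # drop() returns a new frame; the caller's table is untouched.
            alternatives = alternatives.drop(choices.dropna().unique())
    return pd.concat(results)


def first_available(choosers, alternatives):
    """Toy predictor: hand out alternatives in table order."""
    taken = alternatives.index[:len(choosers)].values
    return pd.Series(taken, index=choosers.index)


# Made-up segments and alternatives.
segments = {
    'low_income': pd.DataFrame({'income': [10, 20]}, index=[1, 2]),
    'high_income': pd.DataFrame({'income': [90]}, index=[3]),
}
alternatives = pd.DataFrame({'price': [5, 6, 7]}, index=[100, 101, 102])

lcm_style = predict_by_segment(
    segments, alternatives, first_available, remove_alts=True)
auto_style = predict_by_segment(
    segments, alternatives, first_available, remove_alts=False)
```

With `remove_alts=True` chooser 3 gets alternative 102 because 100 and 101 are already gone; with `remove_alts=False` chooser 3 sees the full pool again and gets 100.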
.loc is really slow with large indexes. We can do things much faster using location based indexing instead of label based indexing. Here I'm replacing a .loc with a .take.
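For reference, the general shape of that kind of swap looks like the following. The table and labels are made up, and this is not the exact line changed in the commit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'value': np.arange(5) * 10}, index=[101, 102, 103, 104, 105])
labels = np.array([104, 101, 104])

# Label-based lookup.
by_label = df.loc[labels]

# Position-based lookup: translate labels to integer positions once,
# then take rows by position. get_indexer needs a unique index and
# returns -1 for labels it cannot find, which take() would treat as
# "last row", so this assumes every label is present.
positions = df.index.get_indexer(labels)
by_position = df.take(positions)
```

Both give the same rows in the same order. Any speedup depends on the index size and pandas version, so it is worth timing on real data.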
This is a big set of changes associated with ActivitySim/activitysim#3.
The goal here is to generalize our existing location choice model classes into discrete choice models with varying capabilities. The LCMs had baked in some assumptions that made them inappropriate for something like automobile ownership, even though the underlying MNL code is fully compatible with general discrete choice modeling. These are some of the assumptions:
To support LCMs we need to keep those capabilities, but we also need to be able to calculate probabilities and make choices on a per-chooser basis, as well as not modify alternatives between making choices for different segments.
In this PR I've changed all class names from "LocationChoice" to "DiscreteChoice" and cleaned up docstrings and variable names that referred to locations. I've also added some new options:
`probability_mode` (can be `single_chooser` or `full_product`) and `choice_mode` (can be `individual` or `aggregate`) for controlling how probabilities are calculated and choices are made. At the group level there's a new option `remove_alts` that controls whether alternatives are filtered after performing prediction for a segment.

The defaults are `full_product`, `individual`, and `False` for `probability_mode`, `choice_mode`, and `remove_alts`, respectively. These are the settings you'd use for something like automobile ownership.

For something like LCMs you'd set those to `single_chooser`, `aggregate`, and `True`.

The Travis runs are failing right now because these changes are breaking sanfran_urbansim. I'll make a PR on there shortly.