Set mgwr bw #15
@@ -200,8 +200,9 @@ def __init__(self, coords, y, X_loc, X_glob=None, family=Gaussian(),
self.search_params = {}

def search(self, search_method='golden_section', criterion='AICc',
bw_min=None, bw_max=None, interval=0.0, tol=1.0e-6, max_iter=200, init_multi=True,
tol_multi=1.0e-5, rss_score=False, max_iter_multi=200):
bw_min=None, bw_max=None, interval=0.0, tol=1.0e-6, max_iter=200,
In what situation would a user want to have different max iters for multi and single?
Not sure. I've never had to tinker with either, but they do control different loops: max_iter caps each individual golden-section search and max_iter_multi caps the GAM backfitting. It is probably safe to set them to the same value, but if you did want to change one, I am not sure you would want to change both.
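To make the two loops concrete, here is a toy sketch of where each cap applies. It is not mgwr's implementation: `golden_section`, `toy_backfit`, `toy_score`, and `TARGETS` are hypothetical names, and the coupled quadratic score is a made-up stand-in for the real criterion computed on partial residuals.

```python
import math

TARGETS = [50.0, 120.0]  # made-up optima for two toy components


def toy_score(j, bw, bws):
    # Hypothetical criterion: component j's optimum shifts with the others' bandwidths.
    others = sum(bws) - bws[j]
    return (bw - TARGETS[j] - 0.1 * others) ** 2


def golden_section(score, lo, hi, tol=1.0e-6, max_iter=200):
    """Minimise score on [lo, hi]; max_iter caps this inner search only."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        b = hi - ratio * (hi - lo)
        c = lo + ratio * (hi - lo)
        if score(b) < score(c):
            hi = c  # minimum lies in [lo, c]
        else:
            lo = b  # minimum lies in [b, hi]
    return (lo + hi) / 2.0


def toy_backfit(score, k, lo, hi, tol_multi=1.0e-5, max_iter_multi=200, max_iter=200):
    """Outer loop: one bandwidth search per component per sweep.

    max_iter_multi caps the number of sweeps; max_iter is passed to each inner search.
    """
    bws = [hi] * k
    sweeps = 0
    for _ in range(max_iter_multi):
        sweeps += 1
        old = list(bws)
        for j in range(k):
            bws[j] = golden_section(lambda bw: score(j, bw, bws), lo, hi, max_iter=max_iter)
        # Stop once no bandwidth moved by more than tol_multi in this sweep.
        if max(abs(n - o) for n, o in zip(bws, old)) < tol_multi:
            break
    return bws, sweeps


bws, sweeps = toy_backfit(toy_score, k=2, lo=10.0, hi=200.0)
print(bws, sweeps)
```

In this toy, lowering `max_iter_multi` truncates the outer sweeps without touching the precision of each inner search, which is one reason the two caps might be tuned separately.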
Just one question. @Ziqi-Li @weikang9009 feel free to read and give feedback.
"""
Multiscale GWR bandwidth search procedure using iterative GAM backfitting
"""
if init:
if init is None:
bw = sel_func(bw_func(y, X))
optim_model = gwr_func(y, X, bw)
Just one question.
I am wondering whether the initial GWR optim_model takes multi_bw_min/max into account when searching for its bandwidth. Say the user sets multi_bw_min/max to [100] and [150], but the optimal bw for the initial GWR is 50. Does it matter in this case?
No, it doesn't. The initial GWR will either use whatever bandwidth the selection routine deems optimal, or the bandwidth will need to be set by the user via init_multi. In either case, these are different parameterizations that do not interact. One is the bandwidth for the initial regression that produces the first round of partial residuals, and the other is for the bandwidths of the GAM components after the initial regression. The value of the initial GWR BW could certainly affect the final outcome of the MGWR GAM routine, but the routine does not seem to be very sensitive to it. Also, I anticipate (1) that most people will leave init_multi=None, since this aligns with current best practices, and (2) that the primary reason to set multi_bw_min and multi_bw_max would be to set them to the same value, in which case the initial value doesn't seem to affect the end results. This was the use case I introduced this functionality for, since we needed it for the simulations in the inference paper, where the BW needs to be the same for each MGWR iteration.
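The scenario from the question can be sketched with a toy bounded search. This is only an illustration under assumptions: `bounded_search` and `toy_criterion` are hypothetical, and a coarse grid search stands in for the golden-section routine. The bounds apply to the component searches only, and equal bounds pin the bandwidth outright.

```python
def toy_criterion(bw):
    # Hypothetical criterion whose unconstrained optimum is at bw = 50.
    return (bw - 50.0) ** 2


def bounded_search(score, bw_min, bw_max, steps=50):
    """Pick the best bandwidth on an evenly spaced grid over [bw_min, bw_max]."""
    if bw_min == bw_max:
        # Equal bounds leave nothing to search: the bandwidth is predefined.
        return bw_min
    width = bw_max - bw_min
    candidates = [bw_min + i * width / steps for i in range(steps + 1)]
    return min(candidates, key=score)


# Initial regression: searched over its own (wide) range, so it can land on 50.
initial_bw = bounded_search(toy_criterion, 10, 200, steps=190)

# Component search: restricted to [100, 150], so it settles on the nearest bound.
component_bw = bounded_search(toy_criterion, 100, 150)

# Equal min and max: the component bandwidth is fixed regardless of the criterion.
pinned_bw = bounded_search(toy_criterion, 120, 120)

print(initial_bw, component_bw, pinned_bw)
```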
Looks good!
This PR introduces functionality that allows the user to set min and max bw ranges for mgwr, as can already be done for gwr. When the two arguments are equal, this lets the user "predefine" the mgwr bandwidths. The PR introduces the multi_bw_min and multi_bw_max parameters, which are similar to the bw_min and bw_max parameters used for the gwr search, except that they take lists instead of single values as arguments. If the list is composed of a single value, that value is automatically applied to each covariate; if the list supplies a value for each of the k covariates (including a potential intercept), each covariate gets its own min/max. Any number of values other than 1 or k will throw an error. This is the simplest way I could think of to do this for now. Ideally, though, we wouldn't need one set of parameters for gwr and another for mgwr, so that the API doesn't get too complex.