Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Reverse Geocode #41

Open
kshailove opened this issue Dec 23, 2013 · 11 comments
Open

Feature Request: Reverse Geocode #41

kshailove opened this issue Dec 23, 2013 · 11 comments

Comments

@kshailove
Copy link

Thanks for the great module.

I have one more requirement though. I need to find out the city details from the latitude and longitude values. Since the maxmind database already contains tables which contain these values, should that be doable?
I found some node modules which are calling google apis for this, however, I am trying to see if I can achieve this without calling a web service and can do it offline.

@bluesmoon
Copy link
Collaborator

Unfortunately, you might need boundary coordinates to do that since cities
are not rectangular. The database does not contain bounding boxes.

@kshailove
Copy link
Author

Sorry, I got confused with the answer. Let me elaborate the question.
I realize that the table GeoLiteCityocation.csv already has this information. What I intend to do is below:-

  1. Input lat, long
  2. Iterate through each row in this table.
  3. Calculate the distance of the point in current row (lat1, long1) from the point (lat, long) using Harversine formula.
  4. Report the row with least distance.
  5. If required, sort based on latitude/longitude and do a binary search for faster performance.

This can be done in csv, the downside is performance though. Since I don't have any idea around how the binary dat format organizes the data, I would need some help to implement this. Would really appreciate a help on this.

elcct added a commit to elcct/node-geoip that referenced this issue Dec 23, 2013
@elcct
Copy link
Contributor

elcct commented Dec 23, 2013

That is interesting problem solution of I decided to implement.

First it creates trie of geo hashes computed from locations.

Introduces new method info that takes latitude and longitude as a parameters.

How it works:

  1. Creates geo hash from provided parameters, then creates 4 adjacent geo hashes
  2. Searches the trie for the hashes above
  3. Picks a result with a shortest distance computed via Harvesine formula to the provided parameters
  4. Returns info dictionary with country, city, region, ll and distance

Example:

var geoip = require('geoip-lite');

var geo = geoip.info(51.5072, 0.1275);

console.log(geo);

Result:

{ country: 'GB',
  city: 'Rainham',
  region: 'E4',
  ll: [ 51.5, 0.1833 ],
  distance: 3.944303662415857 }

If you think that is useful I could create a pull request

@kshailove
Copy link
Author

Thanks, this will definitely be helpful, have you checked in into node module as well? I would request you to do that, please

Meanwhile, I did some intermediate research around this and found that kd-tree is also useful for such scenarios. I created an intermediate solution for myself with node module named 'kdt' and below are the stats (tested with GeoLiteCity-Location.csv having around 470,000 records):-

  1. CSV load time: around 20 seconds on my ubuntu (1.5 GB RAM, running on VMBox)
  2. Tree creation time: around 25 seconds.
  3. Average lookup time: 100 microseconds.

As you can see, the penalty is mostly around CSV read and tree creation. Unfortunately I am not good at coding for binary data, so the solution you have implemented will definitely worth exploring and I will also do a performance comparison and will let you know the results.

Also, can you please add areacode, metrocode and postal code wherever available.

@kshailove
Copy link
Author

Sorry for repeated messages. I did a quick testing by replacing geoip.js and utils.js and putting geohash.js in the lib folder and found that there is a performance regression while loading the module, it now takes more than 40 seconds to load (previously it was quick, less than a second). Please see below

Loading module started: Tue Dec 24 2013 17:46:43 GMT+0530 (IST)
Loading module finished: Tue Dec 24 2013 17:47:27 GMT+0530 (IST)
IP Lookup started: Tue Dec 24 2013 17:47:27 GMT+0530 (IST)
IP Lookup finished: Tue Dec 24 2013 17:47:27 GMT+0530 (IST)
Geo lookup started: Tue Dec 24 2013 17:47:27 GMT+0530 (IST)
Geo Lookup finished: Tue Dec 24 2013 17:47:27 GMT+0530 (IST)

@elcct
Copy link
Contributor

elcct commented Dec 24, 2013

Indeed the time it loads increases significantly. I have been thinking on how to optimize this. Sorry for limited response, but I am quite busy today.

@kshailove
Copy link
Author

Hi elcct,
Did you get time to look at it, would really appreciate a help.
BTW, one of the ideas that came to my mind is save the trie structure to a file and then load it, no need to create the trie everytime, since the data will not frequently change.

@elcct
Copy link
Contributor

elcct commented Jan 2, 2014

Sorry I was on a break. I tried some other implementations of binary tree, but it doesn't make too much of a difference in terms of loading speed.
I was able to speed up loading time by limiting the depth of the tree to say 4 first characters of the hash, but that would sacrifice look-up time.

Probably the best solution would be to use a database like MongoDB (Mongo has geo index built-in) - not sure if that is considered for this project though.

@tommychang1983
Copy link

Hi, is there any update or update about the performance for this feature? :)
I'm also interested in such feature.

@mattvb91
Copy link

Currently have the same requirement to do a reverse lookup based on coordinates.

@bluesmoon
Copy link
Collaborator

This isn't a relevant feature for geo-ip. There is no IP involved. I'm sure there's a different module that is more appropriate.

Secondly, the implementation isn't correct. Haversine distance will tell you which city center is closest, but it does not mean you are actually in that city since cities aren't circular in shape. You'll may not even be in the same country or continent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants