Scan whole maxmind database #23
Comments
I glanced at this and it seemed reasonable. To debug it, I'd probably start with one of the test databases and compare the output. It should be pretty straightforward to see what is going on. The Go reader also provides this functionality if looking at another implementation would be helpful.
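The comparison suggested above could be sketched roughly as follows. This assumes the networks produced by each implementation have already been dumped to lists of CIDR strings; `diff_networks` and its inputs are hypothetical names for illustration, not part of any reader.

```python
from collections import Counter
from ipaddress import ip_network


def diff_networks(candidate, reference):
    """Compare two dumps of networks given as CIDR strings.

    Returns (duplicates, missing, extra):
      duplicates: networks the candidate emitted more than once
      missing:    networks present only in the reference
      extra:      networks present only in the candidate
    """
    # Parse through ip_network so equivalent spellings compare equal.
    cand = Counter(ip_network(n) for n in candidate)
    ref = {ip_network(n) for n in reference}
    seen = set(cand)
    # Sort by string form because IPv4 and IPv6 networks are not orderable together.
    duplicates = sorted((n for n, count in cand.items() if count > 1), key=str)
    missing = sorted(ref - seen, key=str)
    extra = sorted(seen - ref, key=str)
    return duplicates, missing, extra
```

Running this against a small test database first keeps the three lists short enough to inspect by hand.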
So I wanted to get a mapping of autonomous system numbers to names, and surprisingly this wasn't available in an easy to consume form (e.g. CSV) anywhere. So I gave this a stab, starting from the Perl code and then optimizing it and making the interface a bit more Pythonic. This is it:

```python
from ipaddress import IPv4Network, IPv6Network

def __iter__(self):
    # pick the root of the search tree and the network it represents
    if self._metadata.ip_version == 4:
        start_node = self._start_node(32)
        start_network = IPv4Network((0, 0))
    else:
        start_node = self._start_node(128)
        start_network = IPv6Network((0, 0))

    # iterative depth-first walk over (node, network) pairs
    search_nodes = [(start_node, start_network)]
    while search_nodes:
        node, network = search_nodes.pop()

        if network.version == 6:
            naddr = network.network_address
            if naddr.ipv4_mapped or naddr.sixtofour:
                # skip IPv4-Mapped IPv6 and 6to4 mapped addresses, as these are
                # already included in the IPv4 part of the tree below
                continue
            elif int(naddr) < 2 ** 32 and network.prefixlen == 96:
                # once in the IPv4 part of the tree, switch to IPv4Network
                ipnum = int(naddr)
                mask = network.prefixlen - 128 + 32
                network = IPv4Network((ipnum, mask))

        # the two halves of the network correspond to the 0 and 1 branches
        subnets = list(network.subnets())
        for bit in (0, 1):
            next_node = self._read_node(node, bit)
            subnet = subnets[bit]

            if next_node > self._metadata.node_count:
                # a value above node_count is a pointer into the data section
                data = self._resolve_data_pointer(next_node)
                yield (subnet, data)
            elif next_node < self._metadata.node_count:
                # a value below node_count is another tree node to visit;
                # a value equal to node_count means "no data" and is dropped
                search_nodes.append((next_node, subnet))
```

Notes:
Note: the above code snippet is Copyright 2018 Faidon Liambotis and dual-licensed under 1) the Apache License, Version 2.0 (SPDX identifier Apache-2.0) as published by the Apache Software Foundation: https://www.apache.org/licenses/LICENSE-2.0, 2) the 0-clause BSD license (SPDX identifier 0BSD), as published e.g. by the Open Source Initiative: https://opensource.org/licenses/0BSD.
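For the ASN-to-name use case that motivated the snippet, consuming the yielded `(network, data)` pairs might look something like the sketch below. The helper names `asn_names` and `write_csv` are hypothetical, and the record keys `autonomous_system_number` and `autonomous_system_organization` are an assumption about the ASN database layout; the sample list merely stands in for iterating over a real reader.

```python
import csv
import io
from ipaddress import ip_network


def asn_names(records):
    """Collapse (network, data) pairs into a {ASN: organization name} dict."""
    names = {}
    for _network, data in records:
        # Key names are assumed from the ASN database layout; adjust as needed.
        number = data.get("autonomous_system_number")
        if number is not None:
            # Keep the first name seen for each ASN.
            names.setdefault(number, data.get("autonomous_system_organization", ""))
    return names


def write_csv(names, out):
    """Write the mapping as CSV rows sorted by ASN."""
    writer = csv.writer(out)
    writer.writerow(["asn", "name"])
    for number in sorted(names):
        writer.writerow([number, names[number]])


# Made-up sample records standing in for the output of iterating a reader.
sample = [
    (ip_network("192.0.2.0/24"), {"autonomous_system_number": 64500,
                                  "autonomous_system_organization": "Example Net"}),
    (ip_network("198.51.100.0/24"), {"autonomous_system_number": 64496,
                                     "autonomous_system_organization": "Example One"}),
    (ip_network("2001:db8::/32"), {"autonomous_system_number": 64500,
                                   "autonomous_system_organization": "Example Net"}),
]
buf = io.StringIO()
write_csv(asn_names(sample), buf)
```

Because many networks share one ASN, deduplicating into a dict before writing keeps the CSV to one row per autonomous system.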
At a high level, this looks great! The one thing that I am not sure about is the pruning of the IPv4-mapped IPv6 networks. Although the MaxMind databases do map these (and Also, if we do add this to the reader, I think we will want to update the C extension to also support it.
Is there any news on this feature?
There is no news on this feature. To be included in the reader, we would still need to address everything in my previous comment and add appropriate tests. With regard to skipping the aliased nodes, it would probably be better to implement it along the lines of the Go reader and to make it optional (although defaulting to skipping makes sense).
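An optional skip that defaults to on could be sketched as a filter over the yielded pairs. This is only an illustration: `iter_records`, the `skip_aliased` flag, and the `ALIASED` tuple are hypothetical names, the alias ranges are assumed to be the two that the earlier snippet skips (IPv4-mapped and 6to4), and filtering after the fact is simpler but less efficient than pruning those subtrees during the walk.

```python
from ipaddress import IPv6Network, ip_network

# Assumed alias ranges: IPv4-mapped IPv6 and 6to4, as in the earlier snippet.
ALIASED = (IPv6Network("::ffff:0:0/96"), IPv6Network("2002::/16"))


def iter_records(records, skip_aliased=True):
    """Yield (network, data) pairs, dropping aliased IPv6 ranges by default."""
    for network, data in records:
        if (
            skip_aliased
            # subnet_of raises TypeError across IP versions, so check first.
            and network.version == 6
            and any(network.subnet_of(alias) for alias in ALIASED)
        ):
            continue
        yield network, data
```

Passing `skip_aliased=False` returns every pair unchanged, which is useful when comparing against an implementation that does not skip aliases.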
Thanks for the request. We have implemented this and it will be in our next release.
Hi,
I'm currently working on a Python project involving geolocation. Unfortunately, there is no way to scan a whole database in order to reconstruct it in a custom external format.
The reference Perl implementation does support this, via iterate_search_tree (https://github.com/maxmind/MaxMind-DB-Reader-perl/blob/3fade689fa12708981fe70c5419a12a55561508a/lib/MaxMind/DB/Reader/PP.pm).
I tried to implement a Python version of this function, but I'm not sure whether my results are consistent with what I should get. Here is the code:
When I call `iterate_search_tree` in a trivial script that counts entries via `data_callback`, I get implausible results: a total of 2M+ IPv4/IPv6 entries detected, whereas an equivalent script using the reference Perl implementation counts 780K of them (both IPv4 and IPv6) for the same database file. I spent time trying to figure out what could be going wrong, but I haven't managed to debug it.
I was wondering whether you have already thought about including this kind of feature in your package. If so, could you check what's wrong in my code and potentially include it in your reader?
Thanks,