Ability to import Spamhaus DROP lists in the new json format #8107

Closed
2 tasks done
sebvonhelsinki opened this issue Dec 4, 2024 · 10 comments

@sebvonhelsinki

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Is your feature request related to a problem? Please describe.

Spamhaus offers a free list of known malicious IP addresses, the DROP list. Until recently, this list was distributed as a .txt file available under a static URL. The OPNsense documentation has a dedicated how-to for importing and configuring the list in OPNsense: https://docs.opnsense.org/manual/how-tos/drop.html

The "problem": Spamhaus has opted to provide the list in JSON format going forward. For now, both list types (text and JSON) exist side by side, but at some point in the future the text list will be deprecated. In the words of Spamhaus:

For long-term users of the DROP files in text format, we recommend you update your configuration with the above JSON files as soon as your cycles allow. If you require continued long-term use of a text file, the jq command can always be used to convert the JSON.
N.B. The text files are still being populated however, in time, these will be deprecated; users will be notified with ample notice before deprecation takes place.

When this happens, there won't be a way to import the list into OPNsense. As of now, the how-to still works since it includes the old hard links to the .txt files. But without the (now unofficial) links from the manual, adding the list to OPNsense is no longer possible.

Describe the solution you like

My favourite solution would be a "URL JSON (IPs)" alias type in OPNsense that queries a JSON file from a remote source, parses it, and then populates the alias with network addresses.

The alias could be structured similarly to the current "URL Table (IPs)" alias type. It would be configured by providing the URL of a remote source and a "key". OPNsense then queries the remote URL, expecting valid JSON in response. It then iterates over every entry in the JSON and, for every entry, extracts the value of the attribute specified by "key" into the alias.

That way, we have a more generalized importer that is able to parse the new Spamhaus DROP lists, but also any other JSON-formatted list of IP ranges.
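The proposed behaviour can be sketched roughly as follows. This is only an illustration of the idea, not the code that ended up in OPNsense: the function name `extract_networks`, the assumption that the payload holds one JSON object per line, and the `cidr` key in the sample are all hypothetical, and fetching the URL is left out.

```python
import ipaddress
import json


def extract_networks(payload: str, key: str) -> list[str]:
    """Collect the value of `key` from each JSON object in a line-delimited payload."""
    networks = []
    for line in payload.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON on their own
        if not isinstance(entry, dict):
            continue
        value = entry.get(key)
        if not isinstance(value, str):
            continue  # e.g. a metadata record that lacks the key
        try:
            # strict=False accepts entries whose host bits are set
            networks.append(str(ipaddress.ip_network(value, strict=False)))
        except ValueError:
            continue  # value is not an IP network, ignore it
    return networks


# Made-up sample using documentation address ranges, not real DROP data.
sample = "\n".join([
    '{"cidr": "192.0.2.0/24", "note": "example"}',
    '{"cidr": "2001:db8::/32", "note": "example"}',
    '{"type": "metadata", "records": 2}',
])
print(extract_networks(sample, "cidr"))  # ['192.0.2.0/24', '2001:db8::/32']
```

Validating each value with `ipaddress` keeps anything that is not a network out of the alias, which seems prudent when the source is a remote file.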

Describe alternatives you considered

  • A user in the OPNsense forum has provided a script that parses the JSON into a text file readable by the old "URL Table (IPs)" alias type. This could be used to provide the list in the old format from an "official" OPNsense server even after Spamhaus deprecates the .txt files. The downsides are copyright considerations and the need for additional central infrastructure.
  • OPNsense could add the Spamhaus DROP list directly as an alias and internalize the handling. This would increase the base bloat of OPNsense, make it reach out to additional servers by default, and take away configuration options from operators.

Additional context

Forum post with the scraper script: https://forum.opnsense.org/index.php?topic=40660.0
Forum post where other users request the same feature: https://forum.opnsense.org/index.php?topic=41210.0

@AdSchellevis AdSchellevis self-assigned this Dec 4, 2024
@AdSchellevis AdSchellevis added the feature Adding new functionality label Dec 4, 2024
@fichtner fichtner added this to the 25.1 milestone Dec 6, 2024
@fichtner fichtner modified the milestones: 25.1, 25.7 Jan 14, 2025
fichtner pushed a commit that referenced this issue Jan 27, 2025
…arses json payloads and extracts addresses, closes #8107

While here, also fix a minor issue in #8238 to calculate a proper alias hash value when auth properties are specified.

(cherry picked from commit 03a8812)
@gglockner

Nice idea to implement this feature. Unfortunately, only some JSON files are compatible with the implementation in 03a8812. For example, this will not work with JSON data for AWS IP ranges, in part because the current implementation transforms JSON data line-by-line rather than parsing JSON data prior to transforming.
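To illustrate the difference, here is a small sketch comparing a naive per-line approach with parsing the whole document first. The sample is made up (documentation address ranges) and only mimics the general shape of the AWS file, with `prefixes`/`ip_prefix` and `ipv6_prefixes`/`ipv6_prefix`; `parse_line_by_line` is a hypothetical helper and not the code from 03a8812.

```python
import json

# Made-up, pretty-printed sample in the general shape of the AWS IP ranges file.
document = """
{
  "prefixes": [
    {
      "ip_prefix": "192.0.2.0/24",
      "service": "EXAMPLE"
    },
    {
      "ip_prefix": "198.51.100.0/24",
      "service": "EXAMPLE"
    }
  ],
  "ipv6_prefixes": [
    {
      "ipv6_prefix": "2001:db8::/32",
      "service": "EXAMPLE"
    }
  ]
}
"""


def parse_line_by_line(text: str) -> list:
    """Try to decode every line as a standalone JSON value."""
    parsed = []
    for line in text.splitlines():
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError:
            pass  # a fragment of a larger document is not valid JSON by itself
    return parsed


# No single line of the nested document is valid JSON, so nothing is found.
print(parse_line_by_line(document))  # []

# Parsing the whole document first makes the nested arrays reachable.
data = json.loads(document)
prefixes = [p["ip_prefix"] for p in data.get("prefixes", [])]
prefixes += [p["ipv6_prefix"] for p in data.get("ipv6_prefixes", [])]
print(prefixes)  # ['192.0.2.0/24', '198.51.100.0/24', '2001:db8::/32']
```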

I'm capable and willing to contribute code for this, but first I need some guidance from the OPNsense team or at least @AdSchellevis. Specifically, I can imagine 2 different implementations:

| Implementation | Pros | Cons |
| --- | --- | --- |
| Calling the jq JSON processor | Full JSON parsing using a very popular tool | Need to install jq package by default; small security risk |
| A Python data transformation library | Easier and cleaner to package | Incomplete or non-standard transformation syntax; potential incompatibility in open source license |

I'm happy to work on it once I get consensus on which implementation I should do. Thanks.

@AdSchellevis
Member

@gglockner I looked at both options as well, but in terms of maintenance they aren't great. I don't mind implementing minimal parts of a transformation language to fit these types of input as well, but realistically, there will always be formats that won't be parsed (or that require input patterns too difficult for a user to grasp).

@gglockner

Thanks for the quick reply, @AdSchellevis.

What do you think of including a Python library like jq_python:

  1. It's pure Python written as a single file that can be easily included into OPNsense core
  2. It implements a subset of the jq syntax
  3. It has no license so it should be compatible with the OPNsense license

There are some minor code issues with jq_python. So I forked it and started to fix it on my local machine; OPNsense could use my fork.

If this sounds good to you, I'll go ahead and finish my updates to jq_python, then integrate that with OPNsense. Let me know.

@AdSchellevis
Member

It will add another dependency which will likely lose support sooner rather than later, in which case our team is eventually forced to seek alternatives or fork it again. We have seen this a lot in various JavaScript projects, unfortunately, so we are very careful about adding dependencies in general.

Maybe it's better to start with a ticket describing the patterns we would like to support (and for which reasons). I'm open to suggestions on what to add functionally, but don't seek to implement a Swiss army knife.

The current implementation is intentionally limited, but easy to extend.

@gglockner

Sure. What if I removed the core from jq_python (only about 50 lines of code) and added it to src/opnsense/scripts/filter/lib/alias/uri.py? That would support more jq syntax and allow me to import AWS IP addresses from the AWS JSON file. Only downside I see is that the OPNsense team would become responsible for these ~50 lines of Python.
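For a sense of scale, a very small path selector could look something like the sketch below. It is not the jq_python core and not what OPNsense ships; the `select` function and the supported syntax (only `.name` and `[]` steps) are assumptions made for illustration.

```python
import re

# One step of the path: either ".name" or "[]" (iterate over a list).
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|(\[\])")


def select(data, path: str) -> list:
    """Apply a tiny jq-like path such as '.prefixes[].ip_prefix' to parsed JSON."""
    current = [data]
    pos = 0
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported selector near {path[pos:]!r}")
        key, iterate = match.groups()
        if iterate:
            # "[]": replace every list by its items, drop non-list nodes
            current = [item for node in current if isinstance(node, list) for item in node]
        else:
            # ".name": descend into dicts that have the key, drop the rest
            current = [node[key] for node in current if isinstance(node, dict) and key in node]
        pos = match.end()
    return current


# Made-up data using a documentation address range.
data = {"prefixes": [{"ip_prefix": "192.0.2.0/24"}, {"region": "none"}]}
print(select(data, ".prefixes[].ip_prefix"))  # ['192.0.2.0/24']
```

Anything beyond these two step types (filters, pipes, functions) raises `ValueError`, which keeps the supported surface explicit and small.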

@AdSchellevis
Member

I agree with the initial goal (being able to import the AWS list), but I'd rather have a ticket first describing the "selector" to use and the expected outcome so we can assess what we need.

A pull request to accompany the ticket is also fine, but for some reason I have the idea we can support this specific type of imports with only a couple of lines of code.

@fichtner
Member

fichtner commented Feb 2, 2025

This also quickly ends in vendor lock-in (somebody asking for ASN data in JSON too) and at some point the same situation as DynDNS is going to be the norm. Something breaks: we are forced to fix it, somebody wants another list with another format, rinse, repeat. ;(

@AdSchellevis
Member

Well, most of these files are structured similarly, which is why I'm more or less OK with extending the functionality, but only if we can easily do so and have some valid use cases.

@gglockner

This also quickly ends in vendor lock-in (somebody asking for ASN data in JSON too) and at some point the same situation as DynDNS is going to be the norm. Something breaks: we are forced to fix it, somebody wants another list with another format, rinse, repeat. ;(

That’s an argument for using the full jq interpreter as a backend filter.

Per the request of @AdSchellevis, I will open a separate issue to discuss my use case.

@sebvonhelsinki
Author

Thank you for implementing this, I'll see if I can find the time to write the corresponding documentation.
