Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add config option to prevent media downloads from listed domains. #15197

Merged
merged 17 commits into from
May 9, 2023

Conversation

turt2live
Copy link
Member

@turt2live turt2live commented Mar 2, 2023

See config documentation for more details on the feature description.

This could probably be implemented as an antispam or similar module, however this is also tooling that we'd like available to everyone. Instead of trying to bundle a module into every deployment, let's just put it in the core project :)

This is Trust & Safety (T&S) tooling to combat a specific type of abuse. There are no active incidents related to this, but is something we'd like available to us just in case.

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

tests/rest/media/test_domain_blocking.py Show resolved Hide resolved
synapse/media/media_repository.py Show resolved Hide resolved
tests/rest/media/test_domain_blocking.py Outdated Show resolved Hide resolved
@Twi1ightSparkle
Copy link

Will this also support wildcards? I.e.,

prevent_downloads_from:
  - '*.example.org'

Or, would

prevent_downloads_from:
  - example.org

automatically deny example.org and all possible subdomains?

@davidegirardi
Copy link

Wildcard are really important because server1.example.tld and server2.example.tld could be matrix servers for 2 different domains. Moreover, something.domain.tld and somethingelse.something.domain.tld could be separated zones managed by different organisations.

@turt2live
Copy link
Member Author

Wildcards would be a different PR.

@davidegirardi
Copy link

Since you block servers and not domains (D'OH!) it makes perfect sense.

@turt2live
Copy link
Member Author

it's more the cost of implementation versus the benefit: we don't see a particular profile of abuse that needs wildcard bans here, making the implementation+maintenance effort (and associated performance loss) not worth it. If needed, we can always add it in a pinch.

@davidegirardi
Copy link

I think a list without wildcards makes perfect sense since we target servers, not domains. There's no ambiguity this way.

@turt2live turt2live marked this pull request as ready for review March 9, 2023 18:35
@turt2live turt2live requested a review from a team as a code owner March 9, 2023 18:35
@reivilibre
Copy link
Contributor

I think a list without wildcards makes perfect sense since we target servers, not domains. There's no ambiguity this way.

though it'd possibly be quite easy to serve up media on wildcard domains. I guess this is just a theoretical risk until someone does, though.

@clokep clokep self-assigned this Mar 29, 2023
changelog.d/15197.feature Outdated Show resolved Hide resolved
docs/usage/configuration/config_documentation.md Outdated Show resolved Hide resolved
docs/usage/configuration/config_documentation.md Outdated Show resolved Hide resolved
synapse/media/media_repository.py Show resolved Hide resolved
synapse/rest/media/thumbnail_resource.py Outdated Show resolved Hide resolved
tests/rest/media/test_domain_blocking.py Outdated Show resolved Hide resolved
synapse/config/repository.py Outdated Show resolved Hide resolved
@@ -1768,6 +1768,30 @@ Example configuration:
max_image_pixels: 35M
```
---
### `prevent_media_downloads_from`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking about the name. This is a block list, if you later want to use the concept with allow lists, you have to rename the parameter.

Wouldn't it be better to adapt the name to the current parameters for allowing and disallowing federation or url previews?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow lists aren't currently planned for this feature - I'd favour a config migration (deprecation) later on if we decide to go that route.

Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🎉

@clokep clokep merged commit ab4535b into develop May 9, 2023
@clokep clokep deleted the travis/block-media-by-domain branch May 9, 2023 18:08
@catfromplan9
Copy link

Can clients still pull the media from the host server? I have people using matrix.org for abuse and I do not want to prevent matrix.org downloads nor do I want to email matrix.org every time theres illegal media

@catfromplan9
Copy link

The type of abuse described can happen, but it will likely never be media coming from an abusive server. To really prevent this attack, you would need to index every single public matrix server and block media federation. This PR is a great step towards a solution but it needs more fine tuned control such as exists on many ActivityPub platforms

@catfromplan9
Copy link

catfromplan9 commented May 12, 2023

I think adding a boolean federate media flag in addition to this would be beneficial in cases where attackers are determined. Then implement a client option as I describe in my Element feature suggestion to disable fetching media federated or a server request to client to fetch media using the server the media is located on

MatMaul pushed a commit that referenced this pull request May 12, 2023
…5197)

This stops media (and thumbnails) from being accessed from the
listed domains. It does not delete any already locally cached media,
but will prevent accessing it.

Note that admin APIs are unaffected by this change.
@turt2live
Copy link
Member Author

Clients should not be reaching out to the original server the media was hosted on. That's abusive in its own right.

This PR is to target a different abusive case than the one you're describing. I can't go into details here though, sorry.

Please report any illegal material found originating from matrix.org to abuse@matrix.org via email. It's not the most efficient system at the moment, but it is the most reliable.

@catfromplan9
Copy link

Clients should not be reaching out to the original server the media was hosted on. That's abusive in its own right.

This PR is to target a different abusive case than the one you're describing. I can't go into details here though, sorry.

Please report any illegal material found originating from matrix.org to abuse@matrix.org via email. It's not the most efficient system at the moment, but it is the most reliable.

There is honestly not much point to even reporting it. I will do it, sure, but these are throwaway accounts doing this. In the time it takes for a report to process, the burner accounts already did their jobs

@turt2live
Copy link
Member Author

The team does pattern analysis in these cases, but this is highly offtopic for this PR. Not reporting burner accounts just means they're able to get away with it.

@catfromplan9
Copy link

Yeah, sorry for flooding this PR with off topic. I just wish to see an option to disable media downloading from my server. A way to host my matrix without doubling as a file upload site and having to deal with the burdens associated with that. Maybe eventually, I can hope a PR that does what im looking for is merged

@JokerGermany
Copy link

Can I use this if i only want to upload media from my server, but don't want to download media from any other server? (The Server is only used for bridges/bots)

@turt2live
Copy link
Member Author

Folks, I'm locking this. Please use the existing support rooms for questions/future extensions.

@matrix-org matrix-org locked as resolved and limited conversation to collaborators May 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants