
How are lookalike domains handled? #56

Open
ThisIsMissEm opened this issue Sep 9, 2023 · 4 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@ThisIsMissEm

For example, mastоdon.social isn't mastodon.social (the official instance): the first domain uses a lookalike character in place of the first o, so in punycode it would be xn--mastdon-djg.social, which is clearly different.
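To make the difference concrete, here is a minimal sketch using Python's built-in `idna` codec. The variable names are only illustrative, and the standard-library codec implements IDNA 2003, which may not match exactly what Mastodon does internally.

```python
# The lookalike uses U+043E (CYRILLIC SMALL LETTER O) for the first "o".
lookalike = "mast\u043edon.social"
official = "mastodon.social"

# The two strings render almost identically but are not equal.
print(lookalike == official)  # False

# Encoding to the ASCII (punycode) form makes the difference visible.
print(lookalike.encode("idna").decode("ascii"))  # xn--mastdon-djg.social
print(official.encode("idna").decode("ascii"))   # mastodon.social
```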

When Mastodon returns domain blocks from the API, they are normalised to punycode, so although the API accepts lookalike characters, they appear as punycode in the response.

I had a look through the code, and from what I can tell there is no code for handling domain punycode normalisation, which may cause unexpected results with this tool if a source blocklist does not do punycode normalisation.

Note: As this project has neither a SECURITY.md file, nor the GitHub Security features enabled, I was not able to disclose this potential issue in a more responsible disclosure manner, without seeking out contributor email addresses (typically a privacy violation).

@jpwarren added the enhancement and help wanted labels on Sep 10, 2023
@jpwarren
Member

I'm not sure I understand the issue enough to know what to do about it. Could you please elaborate a little?

Is the risk that someone might think they're blocking a domain, but aren't? Or maybe block something else that looks similar but isn't the same?

And what behaviour should be expected? Should we add punycode normalisation so, no matter what gets imported, fediblockhole always operates on punycode normalised domains for its comparisons and upserts into instances?

Sorry to be dense. Just want to make sure I appreciate the issue properly.

(Reporting this publicly is fine. I'll have another look at setting up GitHub's security thing.)

@ThisIsMissEm
Author

Yeah, I think normalisation using punycode would probably be a good idea; that way you're always comparing correctly. The risk is mostly in potential mismatches between the blocklist and the instance, so yeah, someone thinks they're blocking a bad instance but they're actually not.

@jpwarren
Member

jpwarren commented Nov 1, 2024

If I understand this issue correctly, the risk is:

  1. Someone puts a lookalike domain into their blocklist. This is probably not an issue if the list comes via the API from a Mastodon instance, because those domains are punycode normalised, but a text file could be edited by hand and designed to mislead.
  2. You read in the blocklist from this source and block something you didn't mean to, or believe you're blocking one thing but are actually blocking something else and thus are not blocking the thing you mean to.
  3. The risk is highest for new blocklists from sources you are only just starting to trust.
  4. There is also a risk if a blocklist you already trust is somehow compromised (unauthorised update after a breach, or an insider who decides to be evil today). It will be more difficult to detect the incorrect block because of the lack of punycode normalisation.

The remedy would be to normalise with punycode somehow. That will make it easier to detect the attempt at misleading people.
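As a rough sketch of what such normalisation could look like, assuming Python's standard-library `idna` codec is acceptable (it implements IDNA 2003; the third-party `idna` package implements IDNA 2008 and may give different results for some inputs). The helper name `normalise_domain` is hypothetical and is not an existing fediblockhole function.

```python
def normalise_domain(domain: str) -> str:
    """Return a lowercase ASCII (punycode) form of a domain name.

    Hypothetical helper: trims whitespace and a trailing dot, lowercases,
    then encodes any non-ASCII labels as "xn--..." punycode labels.
    Raises UnicodeError if a label is empty or too long.
    """
    cleaned = domain.strip().rstrip(".").lower()
    return cleaned.encode("idna").decode("ascii")


print(normalise_domain("mast\u043edon.social"))    # xn--mastdon-djg.social
print(normalise_domain("xn--mastdon-djg.social"))  # unchanged: already ASCII
```

Because an already-encoded domain passes through unchanged, applying the helper more than once should be harmless.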

Where should this normalisation occur?

Options include:

  1. Whenever a comparison is made between domains.
  2. Whenever domains are loaded in, or saved out.
  3. Both.

I invite comment on which approach we should take, and encourage example implementations and PRs.
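One way to picture option 2: if every domain is normalised at the point it is loaded, later comparisons operate on normalised values without any extra work. The sketch below is illustrative only; `to_ascii` and `load_domains` are made-up names, the blocklists are plain lists of strings rather than fediblockhole's real data structures, and it relies on Python's built-in IDNA 2003 codec.

```python
from typing import Iterable, Set


def to_ascii(domain: str) -> str:
    """Lowercase a domain and encode non-ASCII labels as punycode."""
    return domain.strip().lower().encode("idna").decode("ascii")


def load_domains(raw_entries: Iterable[str]) -> Set[str]:
    """Normalise every entry once, at the load boundary."""
    return {to_ascii(entry) for entry in raw_entries}


# A source blocklist containing a lookalike (U+043E) and a mixed-case entry.
source = load_domains(["mast\u043edon.social", "Example.COM"])
# Domains the instance already blocks.
already_blocked = load_domains(["example.com"])

# Plain set operations now compare normalised forms only.
to_add = sorted(source - already_blocked)
print(to_add)  # ['xn--mastdon-djg.social']
```

Normalising again on save (or on every comparison, as in option 3) would add a second line of defence at the cost of some repeated work.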

@ThisIsMissEm
Author

I'd be inclined to inspect the blocklist and, if any domain in it doesn't match the entry's domain once punycode-encoded, fail the import; i.e., force all domains to be punycode encoded in blocklists.
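A minimal sketch of that strict check, assuming Python's built-in `idna` codec (IDNA 2003) and plain string entries. `NonPunycodeDomainError` and `check_blocklist` are hypothetical names, not part of fediblockhole.

```python
from typing import Iterable


class NonPunycodeDomainError(ValueError):
    """Raised when a blocklist entry is not already in punycode form."""


def check_blocklist(domains: Iterable[str]) -> None:
    """Fail the import if any entry changes when punycode-encoded."""
    for domain in domains:
        encoded = domain.encode("idna").decode("ascii")
        if encoded != domain:
            raise NonPunycodeDomainError(
                f"{domain!r} is not punycode encoded (expected {encoded!r})"
            )


# Passes silently: both entries are already ASCII.
check_blocklist(["xn--mastdon-djg.social", "mastodon.social"])

# Rejected: the entry contains U+043E instead of a Latin "o".
try:
    check_blocklist(["mast\u043edon.social"])
except NonPunycodeDomainError as err:
    print(f"import failed: {err}")
```

Note that this sketch does not reject mixed-case ASCII entries, since encoding leaves them unchanged; a real implementation might also want to enforce lowercase.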
