Group related domains together before spoofing 3rd-party referrers #3194

Closed
fmarier opened this issue Feb 1, 2019 · 6 comments
Labels
closed/not-actionable feature/shields The overall Shields feature in Brave. priority/P4 Planned work. We expect to get to it "soon". webcompat/not-shields-related Sites are breaking because of something other than Shields.

Comments

@fmarier
Member

fmarier commented Feb 1, 2019

In order to fix a whole class of webcompat issues (e.g. #1356), we could group related domains together when we decide whether or not a request is 3rd-party.

For example, google.com and googleapis.com both belong to Google. We're not really "leaking" the referrer if Google can see both requests anyways by virtue of being the recipient in both cases.

Mozilla uses a list of entities for a similar purpose in their tracking protection. We could probably reuse that list.

@diracdeltas also suggested using it for some other 3rd-party checks in Shields.

@fmarier fmarier added feature/shields The overall Shields feature in Brave. webcompat/not-shields-related Sites are breaking because of something other than Shields. labels Feb 1, 2019
@diracdeltas
Member

currently these are the shields features which have some awareness of 3rd-partiness:

  • Ad Block / Tracking Protection - when enabled, does not block ads/trackers if they are first party to the top-level page (i think?)
  • Cookie blocking - there exists a 3p cookie block setting; also blocks referer and local storage mechanisms
  • Fingerprinting protection (AKA device recognition) - there exists a 3p block setting which just blocks the use of fingerprinting APIs when they are done in a 3rd party iframe

I think for all of these it would make sense to change "1st party" to "either 1st party or in the same entity group as the top-level page"
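
A minimal sketch of what "either 1st party or in the same entity group as the top-level page" could look like. Everything here is hypothetical and for illustration only: the `ENTITIES` map, the function names, and the naive "last two labels" domain extraction are assumptions, not Brave's or Mozilla's code. A real implementation would need the Public Suffix List to compute the registrable domain.

```python
# Hypothetical entity map: entity name -> registrable domains it owns.
# The domains are taken from examples in this thread.
ENTITIES = {
    "Google": {"google.com", "googleapis.com"},
    "Yahoo!": {"yahoo.com", "yimg.com"},
}


def registrable_domain(host):
    """Naive eTLD+1: keep the last two labels.

    Assumption for brevity; this is wrong for suffixes such as co.jp,
    which is why real code should consult the Public Suffix List.
    """
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def entity_of(host):
    """Return the entity name that owns this host, or None if unlisted."""
    domain = registrable_domain(host)
    for name, domains in ENTITIES.items():
        if domain in domains:
            return name
    return None


def is_third_party(top_level_host, request_host):
    """Treat a request as 1st-party if it shares the top-level page's
    registrable domain or belongs to the same entity group."""
    if registrable_domain(top_level_host) == registrable_domain(request_host):
        return False
    top_entity = entity_of(top_level_host)
    return top_entity is None or top_entity != entity_of(request_host)


print(is_third_party("www.google.com", "fonts.googleapis.com"))
print(is_third_party("www.google.com", "s.yimg.com"))
```

With this sketch, a request from a google.com page to googleapis.com is no longer treated as 3rd-party, while a request to a domain owned by a different (or unknown) entity still is.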

@tildelowengrimm
Contributor

tildelowengrimm commented Feb 6, 2019

Would we rely on Mozilla to maintain those groups, or would we want to depend on same-site or something?

@fmarier
Member Author

fmarier commented Feb 6, 2019

I'm not sure what "same-site" you're referring to, but last time I checked there was no way to programmatically determine this in a safe way. That might be something that the research team could investigate.

I was thinking that we could start by using the same list as Mozilla, potentially adding to it since we block more than just the Disconnect list.

@tildelowengrimm
Contributor

Maintaining it manually ourselves with help from Mozilla seems like a workable plan, though it may be a bit of work to keep up-to-date.

@tildelowengrimm tildelowengrimm added the priority/P4 Planned work. We expect to get to it "soon". label Feb 6, 2019
@fmarier
Member Author

fmarier commented Feb 16, 2019

Since I wrote a script to parse our .dat files as part of testing brave/tracking-protection#28, I decided to write a parser for the Mozilla entity list and compare it with the one we already generate from the Disconnect blacklist:

  • entries in our entity list: 1026
  • entries in the Mozilla entity list: 1910

That's almost double the number of entries. Looking at the diff though, it's not just extra entries: some properties that we have are missing from the Mozilla entity list:

  • whos.amung.us: amung.us
  • wiredminds.com: wiredminds.com,wiredminds.de
  • xplusone.com: ru4.com,xplusone.com
  • ybrantdigital.com: addynamix.com,adserverplus.com,oridian.com,ybrantdigital.com
  • yesads.com: yesads.com
  • yoggrt.com: yoggrt.com

and some that are incomplete:

-yahoo.com: address.yahoo.com,adinterax.com,adrevolver.com,adserver.yahoo.com,advertising.yahoo.com,alerts.yahoo.com,analytics.yahoo.com,avatars.yahoo.com,bluelithium.com,buzz.yahoo.com,calendar.yahoo.com,dapper.net,edit.yahoo.com,interclick.com,legalredirect.yahoo.com,login.yahoo.com,mail.yahoo.com,marketingsolutions.yahoo.com,my.yahoo.com,mybloglog.com,notepad.yahoo.com,overture.com,pulse.yahoo.com,rightmedia.com,rmxads.com,rocketmail.com,secure-adserver.com,thewheelof.com,webmessenger.yahoo.com,yieldmanager.com,yieldmanager.net,yldmgrimg.net,ymail.com
+yahoo.com: adinterax.com,adrevolver.com,bluelithium.com,dapper.net,flickr.com,flurry.com,interclick.com,luminate.com,mybloglog.com,overture.com,pixazza.com,rightmedia.com,rmxads.com,rocketmail.com,secure-adserver.com,staticflickr.com,tumblr.com,yahoo.co.jp,yahoo.com,yahooapis.com,yahooapis.jp,yahoofs.com,yieldmanager.com,yieldmanager.net,yimg.com,yimg.jp,yldmgrimg.net,ymail.com,yuilibrary.com,zenfs.com
-yandex.com: adfox.yandex.ru,an.yandex.ru,awaps.yandex.ru,mc.yandex.ru,moikrug.ru,web-visor.com,yandex.ru/clck/click,yandex.ru/clck/counter,yandex.ru/cycounter,yandex.ru/portal/set/any,yandex.ru/set/s/rsya-tag-users/data
+yandex.com: api-maps.yandex.ru,moikrug.ru,web-visor.com,yandex.by,yandex.com,yandex.com.tr,yandex.ru,yandex.st,yandex.ua
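
For illustration, a comparison like this can be sketched in a few lines. This is not the script mentioned above; the function names are hypothetical, and the input is assumed to be `property: comma-separated resources` lines in the same shape as the excerpts in this comment. The sample data is a trimmed subset of those excerpts.

```python
def parse_entity_lines(text):
    """Parse 'property: res1,res2' lines into {property: set(resources)}."""
    entities = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, resources = line.partition(": ")
        entities[name] = {r for r in resources.split(",") if r}
    return entities


def diff_entities(ours, theirs):
    """Return (properties only in ours, per-property resource changes)."""
    missing = sorted(set(ours) - set(theirs))
    changed = {}
    for name in sorted(set(ours) & set(theirs)):
        if ours[name] != theirs[name]:
            changed[name] = {
                "removed": sorted(ours[name] - theirs[name]),
                "added": sorted(theirs[name] - ours[name]),
            }
    return missing, changed


# Trimmed sample in the same format as the excerpts above.
OURS = """
yesads.com: yesads.com
yoggrt.com: yoggrt.com
yandex.com: moikrug.ru,web-visor.com,mc.yandex.ru
"""
MOZILLA = """
yandex.com: moikrug.ru,web-visor.com,yandex.ru
"""

missing, changed = diff_entities(parse_entity_lines(OURS),
                                 parse_entity_lines(MOZILLA))
print("missing properties:", missing)
for name, delta in changed.items():
    print(name, "-", delta["removed"], "+", delta["added"])
```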

Looking at the Yahoo! ones, here are the resources that are present in the .dat file but not in the Mozilla entity list:

  • address.yahoo.com (Social)
  • adserver.yahoo.com (Advertising)
  • advertising.yahoo.com (Advertising)
  • alerts.yahoo.com (Social)
  • analytics.yahoo.com (Analytics)
  • avatars.yahoo.com (Social)
  • buzz.yahoo.com (Social)
  • calendar.yahoo.com (Social)
  • edit.yahoo.com (Social)
  • legalredirect.yahoo.com (Social)
  • login.yahoo.com (Social)
  • mail.yahoo.com (Social)
  • marketingsolutions.yahoo.com (Advertising)
  • my.yahoo.com (Social)
  • notepad.yahoo.com (Social)
  • pulse.yahoo.com (Social)
  • thewheelof.com (Advertising)
  • webmessenger.yahoo.com (Social)
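
As a cross-check, the list above can be reproduced from the two yahoo.com lines in the diff with a plain set difference. This is only a sketch: the variable names are made up, and the comparison is an exact string match, so a subdomain such as login.yahoo.com counts as missing even though yahoo.com itself appears in the Mozilla list.

```python
# Resources for yahoo.com in our list (the "-" line in the diff above).
OURS = (
    "address.yahoo.com,adinterax.com,adrevolver.com,adserver.yahoo.com,"
    "advertising.yahoo.com,alerts.yahoo.com,analytics.yahoo.com,"
    "avatars.yahoo.com,bluelithium.com,buzz.yahoo.com,calendar.yahoo.com,"
    "dapper.net,edit.yahoo.com,interclick.com,legalredirect.yahoo.com,"
    "login.yahoo.com,mail.yahoo.com,marketingsolutions.yahoo.com,"
    "my.yahoo.com,mybloglog.com,notepad.yahoo.com,overture.com,"
    "pulse.yahoo.com,rightmedia.com,rmxads.com,rocketmail.com,"
    "secure-adserver.com,thewheelof.com,webmessenger.yahoo.com,"
    "yieldmanager.com,yieldmanager.net,yldmgrimg.net,ymail.com"
).split(",")

# Resources for yahoo.com in the Mozilla list (the "+" line in the diff above).
MOZILLA = (
    "adinterax.com,adrevolver.com,bluelithium.com,dapper.net,flickr.com,"
    "flurry.com,interclick.com,luminate.com,mybloglog.com,overture.com,"
    "pixazza.com,rightmedia.com,rmxads.com,rocketmail.com,"
    "secure-adserver.com,staticflickr.com,tumblr.com,yahoo.co.jp,"
    "yahoo.com,yahooapis.com,yahooapis.jp,yahoofs.com,yieldmanager.com,"
    "yieldmanager.net,yimg.com,yimg.jp,yldmgrimg.net,ymail.com,"
    "yuilibrary.com,zenfs.com"
).split(",")

# Exact-match difference: present in ours, absent from Mozilla's.
missing = sorted(set(OURS) - set(MOZILLA))
print(len(missing))  # 18, matching the list above
for resource in missing:
    print(resource)
```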

One of these, analytics.yahoo.com, was part of the initial upload of the Disconnect list back in 2015 (Disconnect, Mozilla), but it's not clear to me why it's not part of the entity list. That same commit also added adserver.yahoo.com, which is also missing from the Mozilla entity list.

Looking at the properties missing from the Mozilla list, I found that ybrantdigital.com is still a tracker in the latest version of the Disconnect list and has been there since the original upload (Disconnect, Mozilla), but it was removed from the properties section in 2017 in this pull request without a comment as to why.

wiredminds.com is similarly missing from the Mozilla list but present in the .dat file. In this case, though, it has never actually been listed as a property in the Mozilla entity list, probably because it used to redirect to wiredminds.de (it's down at the moment).

Bottom line is that while there are differences between our entity list and Mozilla's, some of which make sense and some of which are harder to explain or possibly mistakes, my guess is that we would be better off using their list since it covers a lot more web properties. We could suggest fixes to them if we notice missing entries with a webcompat impact.

@pes10k
Contributor

pes10k commented Apr 2, 2021

Closing because we no longer use any such determinations or lists when deciding referrer policy (see #10825).

4 participants