Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move hardcoded exceptions rules to more specific rules and unify exception handling #3475

Closed
6 tasks done
bbondy opened this issue Feb 24, 2019 · 4 comments · Fixed by brave/brave-core#1770
Closed
6 tasks done

Comments

@bbondy
Copy link
Member

bbondy commented Feb 24, 2019

Currently we maintain different blocking exceptions across across different components.
Some of these are hardcoded and can't be overridden.

Maintaining different blocking exceptions on different sub-components of code sometimes causes concerns and confusion that we fully whitelist things, when in fact we are just allowing them from one of many different components we have.
In the case linked above, we allowed some things in the tracking protection component, but still blocked many things in our ad-block component.

In truth, EasyList and EasyPrivacy have thousands of blocking rules, and exception rules. You can target blocking and exceptions for very specific things. If you used only blocking rules, you'd quickly see nothing loads at all. It relies on both types of rules.

We have 2 main URL blocking components in Brave currently.

  1. Our Disconnect blocking (which we call tracking protection) https://github.com/brave/tracking-protection. These are host only blocking rules that don't have exceptions, except for our global whitelist (which is forced to be host only).
  2. Our ad-blocking library. https://github.com/brave/ad-block. These can match very specific URLs and types of requests.

Despite their names, both components block ads, and both components block tracking.

It's not ideal currently to have global entire domain exceptions in the Disconnect blocking list because they are over-reaching, they aren't specific enough. It also causes confusion to the community as linked above.

What the Brave code does (before this issue is/was closed), is allow certain exceptions in our Disconnect blocking, but we'd still catch a lot in our ad-blocking. We had to fully allow in our Disconect blocking in order to enable blocking of a subset of URLs. We did that via the whitelist vector of hardcoded hosts in the tracking protection component. We would do Disconnect blocking first, and then if it's not blocked, we would go on to ad-blocking.

Now what is being changed in this issue is to do ad-blocking first. Then follow these rules depending on if there's a match or not:

  • If it's not matched to any blocking rule, then go to the Disconnect blocking.
  • If it is matched to a blocking rule, and no exception rule is matched, go to the Disconnect blocking.
  • If it is matched to a blocking rule, and an explicit exception exists, don't go to the Disconnect blocking.

There is no reason to manage exceptions per section of code, so I think this is a good way to manage exceptions, and we can build on this in the future too to enable certain classes of more strict or less strict options. e.g. block social media login buttons.

About exceptions in EasyList and EasyPrivacy

It's worth noting that EasyList and EasyPrivacy maintain thousands of blocking rules and exception rules. At the time of this writing, 5,559 exceptions and 1,197 exception rules to be exact.
Anything that starts with @@ is an exception rule, you can see them here:
https://easylist-downloads.adblockplus.org/easylist.txt
https://easylist-downloads.adblockplus.org/easyprivacy.txt

We also use uBlock incremental webcompat fix list here:
https://github.com/brave/adblock-lists/blob/master/brave-unbreak.txt

And our own:
https://github.com/brave/adblock-lists/blob/master/brave-unbreak.txt

Summary of the plan

Step 1 (this Issue, completed)

Unify tracking protection and ad-block exception handling in 1 place (via ad-block lib rules since they can be very specific). Filter rules will be moved here brave/adblock-lists#45

Step 2 (In progress)

Add options to easily disable the exception rules, it will be allowed by default when this step lands as it is today for FB login buttons, embedded tweets, and embedded FB posts.
This will allow you to uncheck things, and it will look like:

  • Allow exceptions from the EasyList and EasyPrivacy component for the Disconnect list component
  • Allow Facebook login buttons on third party sites
  • Allow Google login buttons on third party sites
  • Allow embedded tweets
  • Allow Facebook embeds
  • Allow LinkedIn embeds

Step 3 (Not yet started)

Add a UI which allows us to block by default but inform the user in case they want to enable the functionality.

We're moving forward on all 3 steps now, but we'll get there sequentially. Step 1 should land early this week and will close this issue.

Technical note on Step 2 and 3:
To do those options I think a good way will be via an extension to AdBlockPlus filter syntax for tagging filter rules:
$tags=embedded-tweets would tag a rule as being related to the embedded tweets option.
When we do matching, if the option for embedded tweets is off, we'd ignore any filters with the tag embedded-tweets. Clients would specify matching options to use, and AdBlockClient::matches would ignore the things that are specified.


Transparency

Documentation on our progress will be maintained here:
https://github.com/brave/brave-browser/wiki/Web-compatibility-issues-with-tracking-protection


A specific use case:

Before: Allow Facebook domains from the tracking protection component, block some subset from brave/ad-block.

Now: Block all Facebook 3p but allow only a single script.

||facebook.com$third-party
||facebook.net$third-party
@@||connect.facebook.com/*/sdk.js$script
@@||connect.facebook.net/*/sdk.js$script

Test plan

  1. Load this page in Brave <0.63.x vs one in 0.63.x and it should have the same things loaded.
    https://fmarier.github.io/brave-testing/social-widgets.html

  2. Test logins to the following sites work:

  1. Retest these issues about embedded tweets, they should still work.
  1. Facebook embeds should work
    I don't know of a site where there is one but you can test it easily by following these steps:
    Go to facebook.com and click on an account with public tweets like Brave Software.
    Click the pull down on the top of a post and get the embed code. Copy the iframe to a local html file to index.html. Use `python -m SimpleHTTPServer 8080 and then navigate to http://localhost:8080/
@bbondy bbondy self-assigned this Feb 24, 2019
@bbondy bbondy changed the title Remove hardcoded exceptions and unify exception handling Move hardcoded exceptions rules to more specific rules and unify exception handling Feb 25, 2019
@bbondy bbondy added this to the 0.63.x - Nightly milestone Feb 25, 2019
@diracdeltas
Copy link
Member

here's a page with embedded facebook feeds: https://www.british-girlguiding-overseas.org.uk/. according to https://support.mozilla.org/en-US/questions/1185550, it is broken with tracking protection without the adblock exemptions.

@fmarier
Copy link
Member

fmarier commented Feb 25, 2019

Here's another test page full of social embeds: https://fmarier.github.io/brave-testing/social-widgets.html

@fmarier
Copy link
Member

fmarier commented Feb 25, 2019

Note that we also have a pairwise whitelisting which is used in the tracking protection component to group related domains together based on the company groupings that are built into the Disconnect list.

I think that maintaining these groupings as part of our third-party'ness checks could be helpful from a webcompat point of view. I would suggest however that we use a more comprehensive list (see #3194).

@kjozwiak
Copy link
Member

kjozwiak commented Mar 12, 2019

Verification PASSED on macOS 10.14.3 x64 using the following build:

Brave 0.61.50 Chromium: 73.0.3683.67 (Official Build) (64-bit)
Revision a83fd4f3207ae83412d329a9ca1239dd1e068345-refs/branch-heads/3683@{#760}
OS Mac OS X

Used the following embedded iframe and created index.html:

<iframe src="https://www.facebook.com/plugins/post.php?href=https%3A%2F%2Fwww.facebook.com%2FBraveSoftware%2Fposts%2F1289851444504938&width=500" width="500" height="567" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowTransparency="true" allow="encrypted-media"></iframe>

Launched via http://localhost:8080/ and ensured the following:

Screen Shot 2019-03-11 at 10 33 20 PM

Verification passed on

Brave 0.61.50 Chromium: 73.0.3683.67 (Official Build) (64-bit)
Revision a83fd4f3207ae83412d329a9ca1239dd1e068345-refs/branch-heads/3683@{#760}
OS Windows 7 Service Pack 1 Build 7601.24312

Used test plan from the description.

image

Verification PASSED on Linux Mint 19.1 x64 using the following build:

Brave 0.61.50 Chromium: 73.0.3683.67 (Official Build) (64-bit)
Revision a83fd4f3207ae83412d329a9ca1239dd1e068345-refs/branch-heads/3683@{#760}
OS Linux

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment