Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove Heroku specific Req headers from being sent to Origin #278

Merged
merged 1 commit into from
Sep 28, 2020
Merged

remove Heroku specific Req headers from being sent to Origin #278

merged 1 commit into from
Sep 28, 2020

Conversation

bulk88
Copy link
Contributor

@bulk88 bulk88 commented Sep 24, 2020

-saves bytes, and avoids triggering IDS/WAF alarms since browser finger
printing will prove these headers are unnatural and on SSL must be a MITM
attack

@coveralls
Copy link

coveralls commented Sep 24, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling 7271e29 on bulk88:no_heroku_headers_to_origin into 3bab870 on Rob--W:master.

Copy link
Owner

@Rob--W Rob--W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide sources or proof that the presence of these request headers are causing issues?

I'm not against cleaning up some headers but as it may break client expectations, I need to see clear advantages to accepting this PR.

server.js Outdated
@@ -33,6 +33,13 @@ cors_proxy.createServer({
'x-heroku-queue-depth',
'x-heroku-dynos-in-use',
'x-request-start',
'x-request-id',
'x-forwarded-for',
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objection against the removal of the listed headers, except maybe for x-forwarded-* because it is a standard non-Heroku specific header, and people may be relying on its existence.

@bulk88
Copy link
Contributor Author

bulk88 commented Sep 26, 2020

Specifically I am dealing with Cloudflare but you dont need to know that for this PR.

No cyber security company will publish their rules for preventing CSRF/PCI/DDOS attacks. A CORS proxy is fundamentally is a security breaking tool.

I've never seen a web server that sends "403 forbidden" if the Origin header is set in a request GET, for what was supposed to be a same-origin XHR or <form> POST or <img> element, but sooner or later in the future it will be best practices to, for a web server to check that Referrer or Origin must be missing, string "null", or same-origin URLs or return 403/Forbidden and never let the XSS request reach the DB/backend. I in this PR would like to strip the heroku-specific headers atleast. I think Origin/Referrer should stay the same as a last measure debug/defense to the orgin server's admin logs and should reveal the domain that is doing an anti-CORS CORS fetch. Heroku specific or all unknown headers could be triggered to block the request by generic "layer 7" security software, but values of "known always-present browser header fields" will be passed through to the backend app specific code unmolested regardless of the header value's exact byte string contents, example checking if Origin/Referrer has same TLD+1 as API server's TLD+1.

Here is an example of extra headers current on the public instance.

await fetch("https://cors-anywhere.herokuapp.com/http://scooterlabs.com/echo?argname=argval");

results in

Simple webservice echo test: make a request to this endpoint to return the HTTP request parameters and headers. Results available in plain text, JSON, or XML formats. See http://www.cantoni.org/2012/01/08/simple-webservice-echo-test for more details, or https://github.com/bcantoni/echotest for source code.

Array
(
    [method] => GET
    [headers] => Array
        (
            [Host] => scooterlabs.com
            [Connection] => close
            [User-Agent] => Mozilla/5.0 (Windows NT 6.1; rv:80.0) Gecko/20100101 Firefox/80.0
            [Accept] => */*
            [Accept-Language] => en-US,en;q=0.5
            [Accept-Encoding] => gzip, deflate, br
            [Referer] => http://example.com/
            [Origin] => http://example.com
            [X-Request-Id] => 93366f3c-a5b8-46fa-9249-bec762fce9da
            [X-Forwarded-For] => 74.64.17.23
            [X-Forwarded-Proto] => https
            [X-Forwarded-Port] => 443
            [Via] => 1.1 vegur
            [Connect-Time] => 0
            [Total-Route-Time] => 0
        )

    [request] => Array
        (
            [argname] => argval
        )

    [client_ip] => 74.64.17.23
    [time_utc] => 2020-09-26T17:34:10+0000
    [info] => Echo service from Scooterlabs (http://www.scooterlabs.com)
)

It is funny that that HTTP debug tool "[client_ip] => 74.64.17.23" published my Charter/Spectrum IP address from X-Forwarded-For header even tho cors-anywhere.herokuapp.com is an Amazon IP address on outgoing requests.

For comparison, my Firefox sent

GET /http://scooterlabs.com/echo?argname=argval HTTP/1.1
Host: cors-anywhere.herokuapp.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:80.0) Gecko/20100101 Firefox/80.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: http://example.com/
Origin: http://example.com
Connection: keep-alive

So all webservers see those extra headers. I would agree on the public instance X-Forwarded-* should be left inplace in case an admin is trying to block abuse from cors-anywhere.herokuapp.com but X-Request-Id, Via, Connect-Time, Total-Route-Time need to be removed.

https://blog.cloudflare.com/moving-from-recaptcha-to-hcaptcha/

The second most popular is our IP Firewall. This is similar to the Firewall and Bot Rules, but much less granular at the IP, ASN, or country level. The majority of CAPTCHAs for this category are done for rules written at the ASN or country level. Presumably our customers find some level of distrust with a particular ASN (for example, why would there be supposed user traffic coming from a cloud infrastructure provider?) or are being attacked from specific countries.

Since I bet 100% of cors-anywhere instances are running on "cloud infrastructure provider" ASNs/IP addrs, and not on residential or mobile ISP ASNs or not-ISP corporate ASNs, cors-anywhere is very likely to be caught by security software as high risk user/IP address. Minimizing the amount of data that exists to the origin webserver to trigger a block rule is the best idea.

On a private CORS proxy I wrote, I do set Origin/Referrer inside the proxy, to same domain as destination URL

var url = new URL(desturl);
h['Origin'] = url.origin;
h['Referer'] = url.origin;
h['Host'] = url.host;

so it is impossible for the destination webserver to know a CORS proxy connected to it, but that is too evil to put on your public instance as a default.

Cloudflare will always will catch ANY cors-anywhere instance because of TLS Client Hello fingerprinting,. They even catch same machine/same IP wget/curl with byte-wise identical HTTP headers to Chrome. Node/Curl/Wget always requests the entire multi KB long X509 TLS certificate in their Client Hello, that triggers Cloudflare that "its a bot, real browsers save X509 certificates forever in cache and NEVER ask for them again until cert expires", so cors-anywhere (Node.js process SSL stack) gets shown the CAPTCHA screen instantly. Anyways, heroku specific headers shouldn't be going to final webserver.

@Rob--W
Copy link
Owner

Rob--W commented Sep 27, 2020

So all webservers see those extra headers. I would agree on the public instance X-Forwarded-* should be left inplace in case an admin is trying to block abuse from cors-anywhere.herokuapp.com but X-Request-Id, Via, Connect-Time, Total-Route-Time need to be removed.

Agreed. If you update the PR to only remove Heroku-specific headers instead of the more generic xfwd headers, then I'm going to merge it.

A CORS proxy is fundamentally is a security breaking tool.

It's not. This project only offers access to resources that are available without authentication. A non-web app (e.g. a Python script) could request the same data without a proxy. The only case where a CORS proxy could introduce a new security risk is when it is placed in a network where other nodes in the network have a misguided belief that every request is confidential (e.g. an open wiki within an intranet). But this is not an issue with this specific project, as it applies to any open proxy.

Cloudflare will always will catch ANY cors-anywhere instance because of TLS Client Hello fingerprinting,. They even catch same machine/same IP wget/curl with byte-wise identical HTTP headers to Chrome. Node/Curl/Wget always requests the entire multi KB long X509 TLS certificate in their Client Hello, that triggers Cloudflare that "its a bot, real browsers save X509 certificates forever in cache and NEVER ask for them again until cert expires", so cors-anywhere (Node.js process SSL stack) gets shown the CAPTCHA screen instantly.

Interesting. Note that even if this project somehow offered perfect cloaking, then it would not be able to pass the challenge because this proxy intentionally strips cookies in both directions.

-saves bytes, and avoids triggering IDS/WAF alarms since browser finger
 printing will prove these headers are unnatural and on SSL must be a MITM
 attack

-leave x-forwarded-* intact since they can be used to block CORS proxy
 abuse if the not-CORS origin webmaster really has to block the proxy
 and they are not unique to Heroku platform
@bulk88
Copy link
Contributor Author

bulk88 commented Sep 28, 2020

Agreed. If you update the PR to only remove Heroku-specific headers instead of the more generic xfwd headers, then I'm going to merge it.

Repushed, I left a comment that the headers exist for private instances and its upto private instance admin to uncomment them if they want.

Interesting. Note that even if this project somehow offered perfect cloaking, then it would not be able to pass the challenge because this proxy intentionally strips cookies in both directions.

Correct. Cors-anywhere disables 2 ways cookies. Without the __cfruid/__cfduid/cf_clearance cookies a CF captcha cant be solved. And obviously the client side UI JS browser side must detect the 403 forbidden captcha, rewrite URLs in the HTML to point to the heroku proxy domain, draw the HTML to the user, user answers the captcha, client UI continues in non-interactive mode doing AJAX calls to the cors proxy, but with all future requests passing in the 3 clouldflare cookies.

In my private instance I turned on 2 way cookies along with cookie domain rewriting, but it is incredibly insecure unless the proxy is permanently locked to fetching from a single non-CORS origin content domain, since all content domain servers would see the cookies of all other content domain servers ever visited by a particular user/browser instance and cors proxy instance combo,

This is fantasy design by now, prefixing the content domain to HTTP only cookies names when passing response to client (content server->client), and stripping the content domain prefix off of the cookie name on client->content server and dropping wrong content server domain cookies from req to content server. But design breaks down, if cors-anywhere is used a navigation/browsing (no Origin Header) proxy (yet another hack), not-HTTP-only cookies must have their names intact in case JS wants to manipulate them through document.cookies and cant have proprietary prefixes on the cookie names.

Copy link
Owner

@Rob--W Rob--W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@Rob--W Rob--W merged commit c8a2091 into Rob--W:master Sep 28, 2020
@Rob--W
Copy link
Owner

Rob--W commented Sep 28, 2020

Deployed.

@bulk88 bulk88 deleted the no_heroku_headers_to_origin branch September 29, 2020 06:43
Rob--W added a commit that referenced this pull request Mar 17, 2021
- Omit unnecessary `Access-Control-Max-Age` (#277)
- Remove more Heroku-specific headers (#278)
- Add `handleInitialRequest` option (#335)
- Document access requirements for public demo (#301)
- Update gTLD list
DiegoFleitas added a commit to DiegoFleitas/cors-anywhere that referenced this pull request May 26, 2022
* Extend supported Node.js from <=9 to <=14

* test-memory: destroy response to free socket

Starting from Node 12, the test started to fail because of
intermittent socket errors, such as ECONNRESET and "socket hang up".

Destroying the response before triggering a new request resolves it.

* Explicit early out for invalid URLs

* Version 0.4.2

- Reject invalid URLs earlier instead of trying to continue with the
  request (and failing anyway).
- Explicitly close the response when an error occurs for Node 13+.
- Update tests to cover up to Node 14 (was up to 9).

* Update test expectation for Node 12.x

* test-memory: fix test by passing --max-http-header-size

The test broke because Node lowered the maximum header size to defend
against large headers ( CVE-2018-12121 ).

In the test, we do actually want to pass large headers, because all
processing in CORS Anywhere is based on headers (the request body would
just be forwarded to the destination server).

The test failed intermittently with ECONNRESET or "socket hang up"
because the server (under test) would close the socket upon receiving
a request with too large request headers.

* Pass --max-http-header-size in supported versions only

* Reject invalid redirects

Fixes Rob--W#234.

* Version 0.4.3

- Reject invalid URLs in redirects (fixes regression from 0.4.2) (Rob--W#234)
- Update memory tests for recent Node versions.

* only send Access-Control-Max-Age if preflight request, not POST/GET

-Access-Control-Max-Age header only has meaning for preflights, not
 POST or GET, saves wire bytes by excluding it from POST/GET/etc,
 and future problems if ACMA on a content HTTP method is given
 meaning by W3C or a browser vendor

-fix expectNoHeader() test helper func ,this was a no-op before by
 accident and would NEVER fail,
 supertest/test.js:Test.prototype._assertFunction requires an retval of
 class type Error if test fail, not a string or a number or Object

* remove Heroku specific Req headers from being sent to Origin

-saves bytes, and avoids triggering IDS/WAF alarms since browser finger
 printing will prove these headers are unnatural and on SSL must be a MITM
 attack

-leave x-forwarded-* intact since they can be used to block CORS proxy
 abuse if the not-CORS origin webmaster really has to block the proxy
 and they are not unique to Heroku platform

* Remove obsolete values from server.js's removeHeaders

`X-Heroku-Dynos-In-Use`, `X-Heroku-Queue-Depth` and
`X-Heroku-Queue-Wait-Time` have already been dropped in 2013:
https://devcenter.heroku.com/changelog-items/218

* Add handleInitialRequest option to support Rob--W#301

The custom filtering logic is not part of the public repository, to
keep the project clean.

* Expand handleInitialRequest documentation Rob--W#335

* Add note about availability of public demo server

Referencing Rob--W#301

* Update gTLD list

* Version 0.4.4

- Omit unnecessary `Access-Control-Max-Age` (Rob--W#277)
- Remove more Heroku-specific headers (Rob--W#278)
- Add `handleInitialRequest` option (Rob--W#335)
- Document access requirements for public demo (Rob--W#301)
- Update gTLD list

* Support NODE_TLS_REJECT_UNAUTHORIZED=0 to ignore client errors Rob--W#341

Apparently `NODE_TLS_REJECT_UNAUTHORIZED` is only effective if
`rejectUnauthorized` was not overridden by the code:
https://github.com/nodejs/node/blob/85e6089c4db4da23dd88358fe0a12edefcd411f2/lib/_tls_wrap.js#L1583-L1591

But the underlying library does override it:
https://github.com/http-party/node-http-proxy/blob/v1.11.1/lib/http-proxy/common.js#L53-L55

Fix this by overriding the option via the library's "secure" option.

* Fix test expectation for old node

* Migrate travis-ci from .org to .com

* Add Node 15.x to Travis

* Show "400 Missing slash" when needed Rob--W#238

* Add LICENSE file based on README.md Rob--W#297

* Fix typo

Co-authored-by: Rob Wu <rob@robwu.nl>
Co-authored-by: bulk88 <bulk88@hotmail.com>
Co-authored-by: Noodles <20896419+alex-lushiku@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants