Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow HEAD requests to generate a form key #17321

Merged

Conversation

artfulrobot
Copy link
Contributor

Overview

The motivation of this is to avoid generating crashes (500 errors) when bots check links.

Many bots access CiviCRM - specifically CiviMail - URLs to check what they are/where they redirect to/that they exist, for indexing, for anti-spam, for good or for bad. Several of these use HEAD requests instead of GET requests since they do not care for the whole body of the document, just the headers.

Civi's get form key code would generate a key, if missing, for GET requests, but not for HEAD requests. HEAD requests would then be treated like invalid POST requests, generating the error, filling the logs, and triggering whatever other crash/error notification systems the diligent service provider uses.

Before

  • HEAD requests cause "Server Error" with all that entails.
  • External tools to verify URLs think the site is down/link is invalid which may count against us

After

  • HEAD requests are treated as GET requests for the purposes of identifying a key and therefore do not generate server errors.
  • External tools to verify URLs receive a normal 200 status response.

Technical Details

Side note: Civi is very fond of using $_REQUEST which is a bit of a sloppy shortcut, since that variable can be reconfigured to prioritise different data (GET, POST, COOKIE) and may differ between hosting OS distros. This PR introduces a stricter and more specific method: we prefer POST data to GET data; we don't accept a qfKey in COOKIEs.

@civibot
Copy link

civibot bot commented May 14, 2020

(Standard links)

@seamuslee001
Copy link
Contributor

jenkins re test this please

@mlutfy
Copy link
Member

mlutfy commented May 20, 2020

It's slightly annoying that this causes the HEAD request to generate the page, so it will end up being generated twice (because pages are rarely cached), but I think it's the right thing to do.

Technically, I think we should make sure that the HEAD response does not return the body. The RFC says "Must not", but my understanding is that the Internet will not explode if we do.

For reference, the reporterror extension responds with a "Bad Request (400)":
https://lab.civicrm.org/extensions/reporterror/-/commit/0ef13082ee0c0c064b719e386dd9d3f0819bd903

but after more reading, I don't think it's the right thing to do. Although since this is used mostly by RSS readers, bots and URL fetching services (Facebook/Twitter), I don't think it's a big issue, but it's nice to fix it in core.

Looking at Drupal, they seem to respond with a "200" code, and generating the page twice (but in Drupal, the page is usually cached).

@artfulrobot
Copy link
Contributor Author

Good points. I suppose core could check for HEAD requests at

https://github.com/civicrm/civicrm-core/blob/5.25.0/CRM/Core/Invoke.php#L77

like:

<?php
if ($request->getMethod() === 'HEAD') {
  $response->sendHeaders();
  Civi\Utils\System::civiExit();
}

@mlutfy
Copy link
Member

mlutfy commented May 20, 2020

I think that would require making sure that the Content-Length header has the same size, or is absent (which is also OK, if I understand correctly).

I'm not opposed to merging this as-is for now. HEAD requests are a small fraction of webserver traffic.

Stats from today on civicrm.org:

$ grep HEAD /var/log/nginx/access.log | wc -l
292
$ grep POST /var/log/nginx/access.log | wc -l
2632
$ grep GET /var/log/nginx/access.log | wc -l
41723

although you might say that's a different problem. We're not very popular on Twitter/Facebook. Most of the HEAD requests are from Slackbot and RSS feeds.

@mattwire
Copy link
Contributor

@artfulrobot Test failures relate

@mattwire mattwire added the needs-work-not-review-ready Submitter required to take action, will be closed after 1-2 weeks in this state label May 26, 2020
@artfulrobot artfulrobot force-pushed the artfulrobot-identify-request-key branch from 4e38860 to 5d76f6d Compare May 26, 2020 15:56
@artfulrobot
Copy link
Contributor Author

Funny - it's related to #17324 that I was working on separately!

@artfulrobot artfulrobot force-pushed the artfulrobot-identify-request-key branch from 5d76f6d to cc59f8f Compare May 26, 2020 16:24
@mattwire mattwire added ready for review and removed needs-work-not-review-ready Submitter required to take action, will be closed after 1-2 weeks in this state labels Jun 6, 2020
@mlutfy
Copy link
Member

mlutfy commented Jun 11, 2020

This seems good to me, and I would merge if there are no objections.

@mlutfy mlutfy added merge ready PR will be merged after a few days if there are no objections and removed ready for review labels Jun 11, 2020
@mattwire
Copy link
Contributor

Agreed - merging

@mattwire mattwire merged commit 4c00b9b into civicrm:master Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
concept approved master merge ready PR will be merged after a few days if there are no objections
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants