Appendix: Proposal for relocating logic from the CDN layer to other parts of the stack

Things that could be moved to router:

Things that could be moved to WAF:

  • Silently ignoring certain requests[3][4]
    • This was the outcome of an incident report; details cannot be provided here because this is a public repo
  • Serving an HTTP 404 response with a hardcoded template[5] if the request URL matches /autodiscover/autodiscover.xml[6]
  • Requiring HTTP Basic auth on integration[7][8] (unless the user's IP is in the allowlist[9])
    • If we handle this in the WAF then we will need to be careful about caching. We might be able to set Vary: Authorization in the response from the WAF, and additionally set Vary: Fastly-Client-IP if and only if the Authorization header is missing from the request
    • We will need to spike this approach to see whether it is actually easier than handling it at the edge
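
The Vary idea in the Basic auth item above can be sketched as follows. This is only an illustration in Python, not WAF configuration: the allowlist range, the function names and the assumption that header names arrive in canonical case are all hypothetical, and a spike would still be needed to confirm that the CDN caches as intended.

```python
import hmac
import ipaddress

# Hypothetical allowlist; the real one lives in the CDN configuration.
ALLOWLIST = [ipaddress.ip_network("192.0.2.0/24")]


def ip_is_allowlisted(client_ip: str) -> bool:
    """Return True if client_ip parses and falls inside an allowlisted range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    return any(address in network for network in ALLOWLIST)


def waf_decision(headers: dict[str, str], expected_authorization: str) -> tuple[int, str]:
    """Return (status, Vary value) for a request to integration.

    Assumes header names are already in canonical case.
    """
    authorization = headers.get("Authorization")
    if authorization is not None:
        # The outcome depends only on the credentials, so vary on them alone.
        ok = hmac.compare_digest(authorization.encode(), expected_authorization.encode())
        return (200 if ok else 401), "Authorization"
    # No credentials: the outcome depends on the client IP as well.
    allowed = ip_is_allowlisted(headers.get("Fastly-Client-IP", ""))
    return (200 if allowed else 401), "Authorization, Fastly-Client-IP"
```

Varying on Fastly-Client-IP only when Authorization is missing keeps the number of cache variants small for authenticated users, at the cost of one variant per client IP for unauthenticated requests.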

Things that we could probably remove:

  • Feature flag for showing recommended related links for Whitehall content[10]
    • This was added as a safety mechanism for the introduction of related links, in case something really inappropriate was found. There are ways to manually override the links now, and we've never used the feature flag, so we can probably remove it now.
    • If we remove this functionality then we will also need to update the docs.
    • Regarding feature flags as a general concept:
      • On the old platform it was necessary to implement these flags in the CDN layer to prevent the need to deploy a new release every time we wanted to enable/disable a feature. A header is set on the backend request to indicate to the application whether the feature is enabled or disabled.
      • In the replatformed world, this becomes a lot easier: we can implement feature flags through environment variables (as opposed to headers), and then enabling or disabling the feature becomes a matter of updating the environment variables in govuk-helm-charts and waiting a couple of minutes for Argo to pick up the changes.
  • Enforcing the use of TLS[11][12]
    • Origin already performs an HTTP 301 redirect to the TLS version of the site. Browsers and CDNs should cache this response.
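
As a rough illustration of the environment-variable approach to feature flags described above, an application could read a flag like this. The variable name FEATURE_RELATED_LINKS and the set of accepted truthy values are hypothetical; nothing here reflects how any GOV.UK application actually reads its configuration.

```python
import os
from collections.abc import Mapping

# Values treated as "enabled"; this set is an assumption for the sketch.
TRUTHY = {"1", "true", "yes", "on"}


def feature_enabled(name: str, default: bool = False,
                    environ: Mapping[str, str] = os.environ) -> bool:
    """Return whether the feature flag stored in environment variable `name` is on."""
    raw = environ.get(name)
    if raw is None:
        # Flag not set at all: fall back to the caller's default.
        return default
    return raw.strip().lower() in TRUTHY


if __name__ == "__main__":
    # Hypothetical flag name, set via the deployment's environment variables.
    print(feature_enabled("FEATURE_RELATED_LINKS"))
```

Toggling the feature would then mean changing the variable's value in the deployment configuration rather than setting a header at the CDN.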

Things that need to remain in our CDN (but become easier to implement/maintain if we later migrate to Compute@Edge):

  • IP denylisting[13] (this must happen at the CDN layer, where caching takes place)
    • This functionality is currently unused (the dictionary that the denylist is read from is empty), but it exists in case we ever need to quickly block IP addresses (for example, during an incident).
  • JA3 denylisting[14][15]
    • The JA3 fingerprint is computed from the TLS handshake, meaning it has to be computed at the node to which the client's TLS connection is made (i.e. the CDN)
    • We could compute the JA3 fingerprint at the CDN layer and pass it via a header to the WAF, where the actual blocking would take place, but this would have implications for caching and so probably isn't feasible
  • Requiring authentication for Fastly PURGE requests[16]
    • This doesn't need parity on CloudFront
  • Sorting query string params[17] and removing Google Analytics campaign params[18] to improve cache hit rate
  • Stripping query string params only for the homepage and /alerts[19]
    • This appears to be a DDoS prevention measure(?) - should we expand this protection to other routes?
  • Automatic failover to static S3/GCS mirror if origin is unhealthy or returns an HTTP 5xx (only in staging and production; in integration we want to be able to see errors as they happen)[20]
  • Stripping the Accept-Encoding header if the content is already compressed[21]
  • Controlling cache behaviour based on the Cache-Control header returned by origin[22]
    • Manually setting Fastly-Cachetype to PRIVATE if Cache-Control: Private[23]
    • Explicitly passing if Cache-Control: max-age=0[24]
    • Explicitly passing if Cache-Control: no-(store|cache)[25]
    • It is unclear which (if any) of these remain necessary if we decide to move to Compute@Edge (it's also unclear why Fastly doesn't respect them automatically 🤷‍♂️)
  • Setting a request id header to allow requests to be traced through the stack[26]
    • It's important to set this at the earliest opportunity, which is when we first receive the request (at edge)
  • Mapping from headers to cookies and back
    • It is considered a best practice to strip cookies before forwarding the request to origin. For this reason our VCL contains logic to map from headers to cookies and back, to implement the following features:
      • GOV.UK accounts[27][28][29]
        • This is described in more detail in RFC-134, and the discussion on the associated PR
        • Code exists in our VCL to map between a cookie named __Host-govuk_account_session in user requests/responses, and the GOVUK-Account-Session and GOVUK-Account-End-Session headers in backend requests/responses, and to control the cache behaviour of these requests/responses
      • A/B testing[30][31]
        • Code exists in our VCL to select a variant for each active test, pass the chosen variant to origin, and store the chosen variant in a cookie so that the same variant will be chosen on the next request
    • This functionality needs to remain in the CDN layer, but becomes much easier to implement in Compute@Edge (details of this might follow in a future RFC).
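
To make the cookie/header mapping for GOV.UK accounts more concrete, here is a minimal sketch of the two directions. The cookie and header names come from the description above; everything else (the function names, the cookie attributes, and the exact semantics of the end-session header) is an assumption made for illustration, and cache control is deliberately left out. RFC-134 remains the authoritative description.

```python
SESSION_COOKIE = "__Host-govuk_account_session"
SESSION_HEADER = "GOVUK-Account-Session"
END_SESSION_HEADER = "GOVUK-Account-End-Session"


def parse_cookies(cookie_header: str) -> dict[str, str]:
    """Split a Cookie request header into a name -> value mapping."""
    cookies = {}
    for part in cookie_header.split(";"):
        name, separator, value = part.strip().partition("=")
        if separator:
            cookies[name] = value
    return cookies


def to_backend_headers(client_headers: dict[str, str]) -> dict[str, str]:
    """Strip cookies from the request, forwarding the session as a header."""
    backend = {k: v for k, v in client_headers.items() if k != "Cookie"}
    cookies = parse_cookies(client_headers.get("Cookie", ""))
    if SESSION_COOKIE in cookies:
        backend[SESSION_HEADER] = cookies[SESSION_COOKIE]
    return backend


def to_client_headers(backend_headers: dict[str, str]) -> dict[str, str]:
    """Turn session headers in the backend response back into a cookie."""
    client = {k: v for k, v in backend_headers.items()
              if k not in (SESSION_HEADER, END_SESSION_HEADER)}
    if END_SESSION_HEADER in backend_headers:
        # Assumed semantics: the backend asks for the session to be cleared.
        client["Set-Cookie"] = f"{SESSION_COOKIE}=; Path=/; Secure; HttpOnly; Max-Age=0"
    elif SESSION_HEADER in backend_headers:
        value = backend_headers[SESSION_HEADER]
        client["Set-Cookie"] = f"{SESSION_COOKIE}={value}; Path=/; Secure; HttpOnly"
    return client
```

For example, a request carrying `Cookie: other=1; __Host-govuk_account_session=abc123` would reach origin with no Cookie header and with `GOVUK-Account-Session: abc123`.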

Things that need to stay in VCL for now, but will become unnecessary if we later move to Compute@Edge:

  • Explicitly marking HTTP 307 responses from origin as cacheable[32]
    • Fastly VCL is built on an old version of Varnish which didn't do this by default; if we migrate to Compute@Edge then we shouldn't need this anymore
  • Enabling Brotli compression[33][34]
    • From the description of the commit that introduced this change, this appears to be a workaround for a limitation in VCL - if that's not the case, we can port this over to Compute@Edge

Known issues with our current config that could be addressed more easily if we move to Compute@Edge:

  • Currently, if origin returns an HTTP 500 and we fail over to the S3 mirror, but the requested path is not present in the mirror, the user receives an HTTP 403 and a very ugly XML-based error page
    • This is expected behaviour: S3 returns a 403 if the file is missing and the access key that was used to make the request does not have the s3:ListBucket permission
    • The fix is to intercept HTTP 403 responses only from the S3 backend, and replace them with a hardcoded error page - much easier to implement in Compute@Edge than in VCL
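
The proposed fix can be sketched in a few lines. This is a Python illustration of the logic only, not Compute@Edge code: the backend name, the error page markup and the choice of a 503 status for the replacement response are all assumptions.

```python
from dataclasses import dataclass, replace

# Hypothetical backend name and error page; both are placeholders.
S3_MIRROR_BACKEND = "mirror_s3"
ERROR_PAGE = "<!DOCTYPE html><title>Sorry, there is a problem</title>"


@dataclass(frozen=True)
class BackendResponse:
    backend: str       # which backend produced the response
    status: int
    content_type: str
    body: str


def replace_mirror_403(response: BackendResponse, error_status: int = 503) -> BackendResponse:
    """Swap S3's XML 403 for a hardcoded error page; leave all else untouched."""
    if response.backend == S3_MIRROR_BACKEND and response.status == 403:
        return replace(
            response,
            status=error_status,  # assumed status; the proposal does not name one
            content_type="text/html; charset=utf-8",
            body=ERROR_PAGE,
        )
    return response
```

Checking the backend name as well as the status matters here, because a 403 from origin may be a legitimate response that should reach the user unchanged.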

Footnotes

  1. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L231-L233

  2. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L606-L612

  3. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L223

  4. https://github.com/alphagov/govuk-cdn-config-secrets/blob/536de2171d17297c08a0a328df53a6b65002e2c4/fastly/fastly.yaml#L30-L39

  5. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L579-L603

  6. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L226-L228

  7. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L202-L207

  8. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L614-L620

  9. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L154-L165

  10. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L218-L220

  11. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L561-L568

  12. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L189-L192

  13. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L178-L180

  14. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L195-L199

  15. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L171

  16. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L236

  17. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L239

  18. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L256-L264

  19. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L266-L326

  20. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L211-L215

  21. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L444-L455

  22. https://github.com/alphagov/govuk-cdn-config/commit/03cb1fc5794658b89ed9f80ab5ca3c0b98a7afe7

  23. https://github.com/alphagov/govuk-cdn-config/commit/54bf796f7c7543a893dbf14a8ca4fa1eae3253a1

  24. https://github.com/alphagov/govuk-cdn-config/commit/fa56132e49d41595ba1681467adb828694cf0086

  25. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L254

  26. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L504-L522

  27. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L350-L361

  28. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L488-L492

  29. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L524-L555

  30. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/_multivariate_tests.vcl.erb

  31. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L459-L461

  32. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L329-L333

  33. https://github.com/alphagov/govuk-cdn-config/blob/55e587b238338caea1c7187c1f5d70cac8e5b104/vcl_templates/www.vcl.erb#L388-L402