Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gatsby): trailing slash redirect in commands/serve #21203

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions packages/gatsby/src/commands/serve.js
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ const telemetry = require(`gatsby-telemetry`)
const detectPortInUseAndPrompt = require(`../utils/detect-port-in-use-and-prompt`)
const getConfigFile = require(`../bootstrap/get-config-file`)
const preferDefault = require(`../bootstrap/prefer-default`)
const fileExtensionRegEx = /.+\..+$/g

onExit(() => {
telemetry.trackCli(`SERVE_STOP`)
Expand All @@ -40,6 +41,29 @@ const readMatchPaths = async program => {
return JSON.parse(rawJSON)
}

/**
* Adds trailing slash to request.url, in order to deceive static middleware
* into getting a directory request, thus preventing redirect.
* This is an express middleware to be used before static middleware
* @param {*} req
* @param {*} res
* @param {*} next
*/
const directoryRedirectPrevention = (req, res, next) => {
// url is home page
if (req.url === `/`) return next()
// url has trailing slash, no need to intervene
if (req.url.slice(-1) === `/`) return next()
Comment on lines +53 to +56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second check subsumes the first because 'x'.slice(-1) === 'x' :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, I wanted to differentiate the two cases, easier to understand.
Falls into "write code for humans" practice. The fact that you noticed, proves my point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying but I still think I'd prefer to drop the redundant check and updating the comment to reflect the two cases being caught here.

// url is a file request, no need for trailing slash
if (fileExtensionRegEx.test(req.url)) return next()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to overshoot a little bit. This is basically the .includes() check while trying to assert that there is a non-dot before and after the dot. I don't have enough context but is there a particular reason for this?

Also, I think the $ in the regex serves no purpose? (I don't think there's a difference with and without it). If you're set on keeping it, I think /.\../ is the same regex. Additionally, this assumes folder parts can not have a dot (that may be legit, depending on the source of urls, but is certainly not generic).


// turning /slug into /slug/
req.originalUrl += `/` // https://expressjs.com/en/api.html#req.originalUrl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first example in this link has:

// GET /search?q=something
console.dir(req.originalUrl)
// => '/search?q=something'

So potentially this line would be doing originalUrl = '/search?q=something' + '/'. That's probably not what you wanted?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, url's with query args is a use case I haven't tackled yet. I'm not fully familiar with gatsby's handle of query args. Your input here is required.
Do we turn /search?q=something into /search/?q=something ?

Copy link
Contributor

@pvdz pvdz Feb 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I have no idea :) I mainly jumped on the syntactical issues with this part of the change set.

Maybe @pieh knows

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this piece of code in production runtime that adjust url to use canonical page path for initial page load (so this will be connected with to gatsby serve behaviour in some way) -

// Explicitly call navigate if the canonical path (window.pagePath)
// is different to the browser path (window.location.pathname). But
// only if NONE of the following conditions hold:
//
// - The url matches a client side route (page.matchPath)
// - it's a 404 page
// - it's the offline plugin shell (/offline-plugin-app-shell-fallback/)
if (
pagePath &&
__BASE_PATH__ + pagePath !== browserLoc.pathname &&
!(
loader.findMatchPath(stripPrefix(browserLoc.pathname, __BASE_PATH__)) ||
pagePath === `/404.html` ||
pagePath.match(/^\/404\/?$/) ||
pagePath.match(/^\/offline-plugin-app-shell-fallback\/?$/)
)
) {
navigate(__BASE_PATH__ + pagePath + browserLoc.search + browserLoc.hash, {
replace: true,
})
}

Just note that "canonical path" on its own doesn't specify if it has trailing slash or not - it just mean path that was used with createPage action call (Gatsby doesn't normalize those right now - at least for pages created in user code in createPages hook). Gatsby does however use trailing slashes for pages created from src/pages directory (so it does have opinion here).

I'm not sure if this make any sense or if this is answer any question here.

But for manipulating URLs I recommend using builtin Node's url module - particularly:

  • url.parse - to created structured url object
  • then modify pathname part to add trailing slash
  • url.format - for converting object back to string

req.url += `/`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which made me look into .url, so. Same link has

req.url is not a native Express property, it is inherited from Node’s http module.

https://nodejs.org/api/http.html#http_message_url

When request.url is '/status?name=ryan'

So it looks like .url has the same problem. Is there a reason why this is not a problem?


return next()
}

const matchPathRouter = (matchPaths, options) => (req, res, next) => {
const { url } = req
if (req.accepts(`html`)) {
Expand Down Expand Up @@ -85,6 +109,7 @@ module.exports = async program => {
app.use(telemetry.expressMiddleware(`SERVE`))

router.use(compression())
router.use(directoryRedirectPrevention)
router.use(express.static(`public`))
const matchPaths = await readMatchPaths(program)
router.use(matchPathRouter(matchPaths, { root }))
Expand Down