[gatsby-plugin-offline] [gatsby/cache-dir] Fix various offline and caching issues #7355

Merged
merged 22 commits into from
Aug 21, 2018
Changes from 21 commits
31 changes: 18 additions & 13 deletions packages/gatsby-plugin-offline/README.md
@@ -29,34 +29,39 @@ and AppCache setup by changing these options so tread carefully.

```javascript
const options = {
staticFileGlobs: [
`${rootDir}/**/*.{woff2}`,
`${rootDir}/commons-*js`,
`${rootDir}/app-*js`,
staticFileGlobs: files.concat([
`${rootDir}/index.html`,
`${rootDir}/manifest.json`,
`${rootDir}/manifest.webmanifest`,
`${rootDir}/offline-plugin-app-shell-fallback/index.html`,
],
...criticalFilePaths,
]),
stripPrefix: rootDir,
// If `pathPrefix` is configured by user, we should replace
// the `public` prefix with `pathPrefix`.
// See more at:
// https://github.com/GoogleChrome/sw-precache#replaceprefix-string
replacePrefix: args.pathPrefix || ``,
navigateFallback: `/offline-plugin-app-shell-fallback/index.html`,
// Only match URLs without extensions.
// Only match URLs without extensions or the query `no-cache=1`.
Review comment (Contributor): Where is this "no-cache=1" coming from?

Review comment (Contributor): Oh, I think I see. You're using it as a way to ensure the SW does catch the reload?

Review comment (Contributor Author): It's for when Gatsby detects a "page not found" after loading from the offline shell. Usually that's correct, but in some cases the website could have been updated, or the URL is a static file that isn't handled by Gatsby (e.g. Netlify CMS's /admin/ page). We need to check whether these files actually exist, which means they must be loaded from the server directly, not from the cache. That's the purpose of this query: to prevent the cache from handling them.

// So example.com/about/ will pass but
// example.com/about/?no-cache=1 and
// example.com/cheeseburger.jpg will not.
// We only want the service worker to handle our "clean"
// URLs and not any files hosted on the site.
navigateFallbackWhitelist: [/^.*(?!\.\w?$)/],
//
// Regex based on http://stackoverflow.com/a/18017805
navigateFallbackWhitelist: [/^.*([^.]{5}|.html)(?<!(\?|&)no-cache=1)$/],
cacheId: `gatsby-plugin-offline`,
// Do cache bust JS URLs until we can figure out how to make Webpack's
// URLs truly content-addressed.
dontCacheBustUrlsMatching: /(.\w{8}.woff2)/, //|-\w{20}.js)/,
// Don't cache-bust JS files and anything in the static directory
dontCacheBustUrlsMatching: /(.*js$|\/static\/)/,
runtimeCaching: [
{
// Add runtime caching of images.
urlPattern: /\.(?:png|jpg|jpeg|webp|svg|gif|tiff)$/,
// Add runtime caching of various page resources.
urlPattern: /\.(?:png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2|json|css)$/,
Review comment (@KyleAMathews, Contributor, Aug 18, 2018): Ugh, we weren't caching JSON files (page data) 🤦‍♂️

handler: `fastest`,
},
],
skipWaiting: false,
skipWaiting: true,
}
```
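
The interaction between the `no-cache=1` query and `navigateFallbackWhitelist` discussed in the review thread can be checked in isolation. The following is a minimal sketch, not code from this PR: `pathForWhitelist` and `isHandledByAppShell` are hypothetical helper names, and the first one mirrors the patched sw.js behaviour of matching against `pathname + search`. It assumes a JavaScript runtime with RegExp lookbehind support.

```javascript
// Regex copied from the options above.
const whitelist = [/^.*([^.]{5}|.html)(?<!(\?|&)no-cache=1)$/]

// Hypothetical helper: mirrors the patched sw.js, which matches the
// whitelist against pathname + search rather than pathname alone.
function pathForWhitelist(absoluteUrlString) {
  const url = new URL(absoluteUrlString)
  return url.pathname + url.search
}

// Hypothetical helper: true if any whitelist regex matches, i.e. the
// service worker would answer the navigation with the app shell fallback.
function isHandledByAppShell(absoluteUrlString) {
  const path = pathForWhitelist(absoluteUrlString)
  return whitelist.some(regex => regex.test(path))
}
```

With these helpers, `https://example.com/about/` is whitelisted, while `https://example.com/about/?no-cache=1` and `https://example.com/cheeseburger.jpg` are not, matching the comment in the options.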
1 change: 1 addition & 0 deletions packages/gatsby-plugin-offline/package.json
@@ -10,6 +10,7 @@
"@babel/runtime": "7.0.0-beta.52",
"cheerio": "^1.0.0-rc.2",
"lodash": "^4.17.10",
"replace-in-file": "^3.4.2",
"sw-precache": "^5.2.1"
},
"devDependencies": {
2 changes: 1 addition & 1 deletion packages/gatsby-plugin-offline/src/app-shell.js
@@ -2,7 +2,7 @@ import React from "react"

class AppShell extends React.Component {
render() {
return <div />
return <span />
}
}

31 changes: 17 additions & 14 deletions packages/gatsby-plugin-offline/src/gatsby-browser.js
@@ -9,7 +9,7 @@ exports.onPrefetchPathname = ({ pathname, getResourcesForPathname }) => {
if (swNotInstalled && `serviceWorker` in navigator) {
pathnameResources.push(
new Promise(resolve => {
getResourcesForPathname(pathname, resources => {
getResourcesForPathname(pathname).then(resources => {
resolve(resources)
})
})
@@ -22,24 +22,27 @@ exports.onServiceWorkerInstalled = () => {
swNotInstalled = false

// grab nodes from head of document
const nodes = document.querySelectorAll(
`head > script[src], head > link[as=script]`
)
const nodes = document.querySelectorAll(`
head > script[src],
head > link[as=script],
head > link[rel=stylesheet],
head > style[data-href]
`)

// get all script URLs
const scripts = [].slice
// get all resource URLs
const resources = [].slice
.call(nodes)
.map(node => (node.src ? node.src : node.href))
.map(node => node.src || node.href || node.getAttribute(`data-href`))

for (const resource of resources) {
fetch(resource)
}

// loop over all resources and fetch the page component and JSON
// thereby storing it in SW cache
Promise.all(pathnameResources).then(pageResources => {
pageResources.forEach(pageResource => {
const [script] = scripts.filter(s =>
s.includes(pageResource.page.componentChunkName)
)
fetch(pageResource.page.jsonURL)
fetch(script)
})
for (const pageResource of pageResources) {
if (pageResource) fetch(pageResource.page.jsonURL)
}
})
}
38 changes: 30 additions & 8 deletions packages/gatsby-plugin-offline/src/gatsby-node.js
@@ -3,6 +3,7 @@ const precache = require(`sw-precache`)
const path = require(`path`)
const slash = require(`slash`)
const _ = require(`lodash`)
const replace = require(`replace-in-file`)

const getResourcesFromHTML = require(`./get-resources-from-html`)

@@ -46,8 +47,14 @@ exports.onPostBuild = (args, pluginOptions) => {
rootDir
)

const criticalFilePaths = getResourcesFromHTML(
`${process.cwd()}/${rootDir}/index.html`
const criticalFilePaths = _.uniq(
_.concat(
getResourcesFromHTML(`${process.cwd()}/${rootDir}/index.html`),
getResourcesFromHTML(`${process.cwd()}/${rootDir}/404.html`),
getResourcesFromHTML(
`${process.cwd()}/${rootDir}/offline-plugin-app-shell-fallback/index.html`
)
)
)

const options = {
@@ -65,21 +72,22 @@ exports.onPostBuild = (args, pluginOptions) => {
// https://github.com/GoogleChrome/sw-precache#replaceprefix-string
replacePrefix: args.pathPrefix || ``,
navigateFallback: `/offline-plugin-app-shell-fallback/index.html`,
// Only match URLs without extensions.
// Only match URLs without extensions or the query `no-cache=1`.
// So example.com/about/ will pass but
// example.com/about/?no-cache=1 and
// example.com/cheeseburger.jpg will not.
// We only want the service worker to handle our "clean"
// URLs and not any files hosted on the site.
//
// Regex from http://stackoverflow.com/a/18017805
navigateFallbackWhitelist: [/^.*([^.]{5}|.html)$/],
// Regex based on http://stackoverflow.com/a/18017805
navigateFallbackWhitelist: [/^.*([^.]{5}|.html)(?<!(\?|&)no-cache=1)$/],
cacheId: `gatsby-plugin-offline`,
// Don't cache-bust JS files and anything in the static directory
dontCacheBustUrlsMatching: /(.*js$|\/static\/)/,
runtimeCaching: [
{
// Add runtime caching of images.
urlPattern: /\.(?:png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2)$/,
// Add runtime caching of various page resources.
urlPattern: /\.(?:png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2|json|css)$/,
handler: `fastest`,
},
],
@@ -88,5 +96,19 @@ exports.onPostBuild = (args, pluginOptions) => {

const combinedOptions = _.defaults(pluginOptions, options)

return precache.write(`public/sw.js`, combinedOptions)
return precache.write(`public/sw.js`, combinedOptions).then(() =>
// Patch sw.js to include search queries when matching URLs against navigateFallbackWhitelist
replace({
files: `public/sw.js`,
from: `path = (new URL(absoluteUrlString)).pathname`,
to: `url = new URL(absoluteUrlString), path = url.pathname + url.search`,
}).then(changes => {
// Check that the patch has been applied correctly
if (changes.length !== 1)
throw new Error(
`Patching sw.js failed - sw-precache has probably been modified upstream.\n` +
`Please report this issue at https://github.com/gatsbyjs/gatsby/issues`
)
})
)
}
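
The patch step above depends on an exact string being present in the generated sw.js. A minimal sketch of the same patch-and-verify idea using plain string operations, without `replace-in-file`: `patchExactlyOnce` is a hypothetical helper, and unlike the PR's check (which inspects the `changes` value returned by `replace`) it counts occurrences of the target string in the source text.

```javascript
// Hypothetical helper: replace `from` with `to`, but only if `from`
// occurs exactly once in `source`; otherwise fail loudly so an upstream
// change to the generated file does not go unnoticed.
function patchExactlyOnce(source, from, to) {
  const occurrences = source.split(from).length - 1
  if (occurrences !== 1) {
    throw new Error(
      `Expected exactly 1 occurrence of the patch target, found ${occurrences}`
    )
  }
  // Use a replacer function so `$` sequences in `to` are not interpreted.
  return source.replace(from, () => to)
}
```

Failing the build when the target string is missing is a deliberate trade-off: a silent no-op would leave the whitelist matching on `pathname` only, so the `no-cache=1` query would have no effect.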
47 changes: 33 additions & 14 deletions packages/gatsby-plugin-offline/src/get-resources-from-html.js
@@ -5,30 +5,49 @@ const _ = require(`lodash`)

module.exports = htmlPath => {
// load index.html to pull scripts/links necessary for proper offline reload
const html = fs.readFileSync(path.resolve(htmlPath))
let html
try {
// load index.html to pull scripts/links necessary for proper offline reload
html = fs.readFileSync(path.resolve(htmlPath))
} catch (err) {
// ENOENT means the file doesn't exist, which is to be expected when trying
// to open 404.html if the user hasn't created a custom 404 page -- return
// an empty array.
if (err.code === `ENOENT`) {
return []
} else {
throw err
}
}

// party like it's 2006
const $ = cheerio.load(html)

// holds any paths for scripts and links
const criticalFilePaths = []

$(`script[src], link[as=script], link[as=font], link[as=fetch]`).each(
(_, elem) => {
const $elem = $(elem)
const url = $elem.attr(`src`) || $elem.attr(`href`)
const blackListRegex = /\.xml$/

if (!blackListRegex.test(url)) {
let path = url
if (url.substr(0, 4) !== `http`) {
path = `public${url}`
}
$(`
script[src],
link[as=script],
link[as=font],
link[as=fetch],
link[rel=stylesheet],
style[data-href]
`).each((_, elem) => {
const $elem = $(elem)
const url =
$elem.attr(`src`) || $elem.attr(`href`) || $elem.attr(`data-href`)
const blackListRegex = /\.xml$/

criticalFilePaths.push(path)
if (!blackListRegex.test(url)) {
let path = url
if (url.substr(0, 4) !== `http`) {
path = `public${url}`
}

criticalFilePaths.push(path)
}
)
})

return _.uniq(criticalFilePaths)
}
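
The `ENOENT` handling above treats a missing file as an expected case while still surfacing every other error. A minimal standalone sketch of that pattern, assuming Node's `fs` module: `readFileIfExists` is a hypothetical helper that returns `null` instead of the empty array used in the PR.

```javascript
const fs = require(`fs`)

// Hypothetical helper: returns the file contents as a string, or null
// if the file does not exist. Any other error (for example a
// permissions problem) is rethrown unchanged.
function readFileIfExists(filePath) {
  try {
    return fs.readFileSync(filePath, `utf8`)
  } catch (err) {
    if (err.code === `ENOENT`) return null
    throw err
  }
}
```

Checking `err.code` rather than catching everything keeps real failures visible, which matters here because only a missing `404.html` is an expected condition.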
48 changes: 48 additions & 0 deletions packages/gatsby/cache-dir/load-directly-or-404.js
@@ -0,0 +1,48 @@
/**
* When other parts of the code can't find resources for a page, they load the 404 page's
* resources (if it exists) and then pass them here. This module then does the following:
* 1. Checks if the 404 page's resources exist. If not, just navigate directly to the desired URL
* to show whatever server 404 page exists.
* 2. Try fetching the desired page to see if it exists on the server but we
* were just prevented from seeing it due to loading the site from a SW. If this is the case,
* trigger a hard reload to grab that page from the server.
* 3. If the page doesn't exist, show the normal 404 page component.
* 4. If the fetch failed (generally meaning we're offline), then navigate anyway to show
* either the browser's offline page or whatever the server error is.
*/
export default function(resources, path) {
return new Promise(resolve => {
const url = new URL(window.location.origin + path)

// Append the appropriate query to the URL.
if (url.search) {
url.search += `&no-cache=1`
} else {
url.search = `?no-cache=1`
}

// Always navigate directly if a custom 404 page doesn't exist.
if (!resources) {
window.location = url
} else {
// Now test if the page is available directly
fetch(url.href)
.then(response => {
if (response.status !== 404) {
// Redirect there if there isn't a 404. If a different HTTP
// error occurs, the appropriate error message will be
// displayed after loading the page directly.
window.location.replace(url)
} else {
// If a 404 occurs, show the custom 404 page.
resolve()
}
})
.catch(() => {
// If an error occurs (usually when offline), navigate to the
// page anyway to show the browser's proper offline error page
window.location = url
})
}
})
}
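
The branching in this module can be separated from `window` and `fetch` to make it easier to reason about. The following is a minimal sketch under that assumption, not part of the PR: `withNoCacheQuery` and `decideAction` are hypothetical helpers, and the action names are invented for illustration.

```javascript
// Hypothetical helper: append the no-cache query the same way the
// module above does, preserving any existing query string.
function withNoCacheQuery(origin, path) {
  const url = new URL(origin + path)
  if (url.search) {
    url.search += `&no-cache=1`
  } else {
    url.search = `?no-cache=1`
  }
  return url.href
}

// Hypothetical helper: a pure version of the decision logic.
// `status` is the HTTP status of the direct fetch, or null if the
// fetch rejected (usually because the browser is offline).
function decideAction(has404Resources, status) {
  // No custom 404 page: always navigate directly.
  if (!has404Resources) return `navigate`
  // Fetch failed: navigate anyway so the browser shows its own error.
  if (status === null) return `navigate`
  // Page exists on the server (or fails with another error): hard reload.
  // A 404 means the custom 404 page should be shown.
  return status !== 404 ? `replace` : `show-404`
}
```

For example, `decideAction(true, 404)` yields `show-404`, which corresponds to the `resolve()` branch above, while any other status leads to a redirect to the URL carrying `no-cache=1`.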
39 changes: 6 additions & 33 deletions packages/gatsby/cache-dir/loader.js
@@ -276,37 +276,6 @@ const queue = {

getPage: pathname => findPage(pathname),

// If we're loading from a service worker (it's already activated on
// this initial render) and we can't find a page, there's a good chance
// we're on a new page that this (now old) service worker doesn't know
// about so we'll unregister it and reload.
checkIfDoingInitialRenderForSW: path => {
if (
inInitialRender &&
navigator &&
navigator.serviceWorker &&
navigator.serviceWorker.controller &&
navigator.serviceWorker.controller.state === `activated`
) {
if (!findPage(path)) {
navigator.serviceWorker
.getRegistrations()
.then(function(registrations) {
// We would probably need this to
// prevent unnecessary reloading of the page
// while unregistering of ServiceWorker is not happening
if (registrations.length) {
for (let registration of registrations) {
registration.unregister()
}

window.location.reload()
}
})
}
}
},

getResourcesForPathnameSync: path => {
const page = findPage(path)
if (page) {
@@ -320,8 +289,6 @@ const queue = {
// if necessary and then the code/data bundles. Used for prefetching
// and getting resources for page changes.
getResourcesForPathname: path => {
queue.checkIfDoingInitialRenderForSW(path)

return new Promise((resolve, reject) => {
const doingInitialRender = inInitialRender
inInitialRender = false
@@ -352,6 +319,12 @@ const queue = {

if (!page) {
console.log(`A page wasn't found for "${path}"`)

// Preload the custom 404 page when running `gatsby develop`
if (path !== `/404.html` && process.env.NODE_ENV !== `production`) {
queue.getResourcesForPathname(`/404.html`)
}

return resolve()
}

7 changes: 5 additions & 2 deletions packages/gatsby/cache-dir/navigation.js
@@ -5,6 +5,7 @@ import emitter from "./emitter"
import { globalHistory } from "@reach/router/lib/history"
import { navigate as reachNavigate } from "@reach/router"
import parsePath from "./parse-path"
import loadDirectlyOr404 from "./load-directly-or-404"

// Convert to a map for faster lookup in maybeRedirect()
const redirectMap = redirects.reduce((map, redirect) => {
@@ -72,10 +73,12 @@ const navigate = (to, options) => {

loader.getResourcesForPathname(pathname).then(pageResources => {
if (!pageResources && process.env.NODE_ENV === `production`) {
loader.getResourcesForPathname(`/404.html`).then(() => {
loader.getResourcesForPathname(`/404.html`).then(resources => {
clearTimeout(timeoutId)
onPreRouteUpdate(window.location)
reachNavigate(to, options).then(() => onRouteUpdate(window.location))
loadDirectlyOr404(resources, to).then(() =>
reachNavigate(to, options).then(() => onRouteUpdate(window.location))
)
})
} else {
onPreRouteUpdate(window.location)