V2 API — Async snapshots #30

Open
wants to merge 12 commits into
base: master
from
96 changes: 61 additions & 35 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,18 +1,20 @@
# React Snapshot

A zero-configuration static pre-renderer for React apps. Starting by targeting Create React App (because it's great)
A zero-configuration static pre-renderer for React apps, built for Create React App (because it's great)

## The Premise

Server-side rendering is a big feature of React, but for most apps it can be more trouble than it's worth. Personally, I think the sweet spot is taking static site snapshots of all your publicly-accessible pages & leaving anything requiring authentication as a normal, JS-driven Single Page App.
Server-side rendering is a big feature of React, but for most apps it can be more trouble than it's worth. You have to choose between serving your users a blank page until the JS loads, or setting up the infrastructure required to generate HTML on a server. Suddenly your application code needs to be aware that sometimes browser APIs will be available and other times it's running in Node.js, and you need to start describing more precisely which routes load which data. Either you optimise for Developer Experience (DX) or User Experience (UX), which is a bad tradeoff to be making.

This is a project to do that. Automatically, without any configuration, just smart defaults. **Retrospective progressive enhancement.**
Thankfully, the same mechanics that make React work as a static HTML page rendered by a server let us do something much simpler: rather than rendering HTML dynamically on a server, render it ahead of time during the build & deployment phase. Then, take these static HTML snapshots and host them anywhere, no server required.

This is a project to do that. Automatically, without any configuration, and with only a few tiny changes to your application code.

The snapshots still have the normal JS bundle included, so once that downloads the site will function exactly as before (i.e. instantaneous page transitions), but you serve real, functional HTML & CSS as soon as possible. It's good for SEO (yes Google crawls SPAs now but they still reward perf and this perfs like a banshee), it's good if your JS is broken or something render-blocking has a network fail, it's good for accessibility, it's good for Slackbot or Facebook to read your opengraph tags, it's just good.

## The How To

- First, `npm i -D react-snapshot`
- First, `npm i -D react-snapshot@next` (for v2)
- Second, open your package.json and change `"scripts"` from

```diff
@@ -33,9 +35,63 @@ The snapshots still have the normal JS bundle included, so once that downloads t
);
```

This calls `ReactDOM.render` in development and `ReactDOMServer.renderToString` when prerendering. If I can make this invisible I will but I can't think how at the moment.
For a static site, that's it! During `build`, react-snapshot will load up your site using JSDOM, crawl it to find all the pages, render each, calculate the React checksum to minimise work on the client, and save the files out to be served by something like [surge.sh](https://surge.sh).

## Dynamic Data

If a route has to fetch data from somewhere, you need to tell react-snapshot about it. But thankfully, that's as easy as:

```diff
+ import { snapshot } from 'react-snapshot'

  class Home extends React.Component {
    state = { quotes: null }

    componentWillMount() {
+     snapshot(() => (
        fetch('/api/quotes')
          .then(response => response.json())
+     ))
        .then(quotes => {
          this.setState({ quotes })
        })
    }

    render() {
      const { quotes } = this.state
      return (
        <div className="Quotes">
          {
            quotes && quotes.map((quote, i) => <Quote key={i} quote={quote}/>)
          }
        </div>
      )
    }
  }
```

Wrap any async process you want to track in a `snapshot` call and it'll be tracked. During deployment, the fetch is performed, the return value is stored & sent as JSON in the HTML snapshot. When the app gets booted on the client, that same snapshot method short-circuits, immediately calling the .then and the state gets populated before the render method is called. This means there's no flash, the checksum matches, and everything is JUST GREAT™.
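
To make the short-circuit concrete, here is a minimal sketch of the idea, not the library's actual code. It assumes the recorded results are available as a plain object keyed by call order; the names `recorded` and `replay` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical stand-in for the JSON that gets embedded in the HTML snapshot.
const recorded = { 0: { success: ['quote one', 'quote two'] } }
let count = 0

// If a result was recorded for this call, return a synchronous thenable
// instead of running `func`. Otherwise fall through to the real async call.
const replay = func => {
  const existing = recorded[count++]
  if (!existing) return func()
  return {
    then(resolve, reject) {
      if ('success' in existing) resolve(existing.success)
      else if (reject) reject(existing.failure)
    }
  }
}

let quotes = null
replay(() => Promise.reject(new Error('the real fetch should not run')))
  .then(value => { quotes = value })

// Because the thenable is synchronous, `quotes` is already set on this line,
// which is what lets state be populated before the first render.
console.log(quotes)
```

A real Promise would always defer the `.then` callback to a later tick, so the first render would happen with empty state. Returning a synchronous look-alike is what avoids the flash.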

Note: you have to use `componentWillMount` instead of `componentDidMount`, since the latter runs *after* the render method, and you'll get a flash.

Since this pattern is quite common, there's also the `Snapshot` higher-order component that lets you treat async dependencies as props:

```js
const Home = ({ quotes }) => (
<div className="Quotes">
{
quotes && quotes.map((quote, i) => <Quote key={i} quote={quote}/>)
}
</div>
)

export default Snapshot({
quotes: () => fetch('/api/quotes').then(resp => resp.json())
}).rendering(Home)
```
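
As a rough illustration of what "async dependencies as props" means, the sketch below resolves an object of `{ propName: () => Promise }` into plain props without involving React at all. `resolveProps` and `fakeFetchQuotes` are hypothetical names made up for this example and are not part of react-snapshot.

```javascript
// Hypothetical helper: run every prop function, wait for all of them,
// then merge the results over the component's own props.
const resolveProps = (propDefs, ownProps) => {
  const names = Object.keys(propDefs)
  return Promise.all(names.map(name => propDefs[name](ownProps)))
    .then(values => {
      const resolved = {}
      names.forEach((name, i) => { resolved[name] = values[i] })
      return Object.assign({}, ownProps, resolved)
    })
}

// Stand-in for a real network request.
const fakeFetchQuotes = () => Promise.resolve(['a', 'b'])

const propsPromise = resolveProps({ quotes: fakeFetchQuotes }, { title: 'Home' })
propsPromise.then(props => console.log(props))
```

Running all the prop functions through a single `Promise.all` means the wrapped component sees either no async props or all of them, never a partial set.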

## Options

You can specify additional paths as entry points for crawling that would otherwise not be found. It's also possible to exclude particular paths from crawling. Simply add a section called `"reactSnapshot"` to your package.json.

```
@@ -62,36 +118,6 @@ Check out [create-react-app-snapshot.surge.sh](https://create-react-app-snapshot

The [diff from the original create-react-app code](https://github.com/geelen/create-react-app-snapshot/compare/303f774...master) might be enlightening to you as well.

## The Implementation

It's pretty simple in principle:

- Fire up the home page in a fake browser and snapshot the HTML once the page is rendered
- Follow every relative URL to crawl the whole site
- Repeat.

There are a few more steps to it, but not many.

- We move `build/index.html` to `build/200.html` at the beginning, because it's a nice convention. Hosts like [surge.sh](https://surge.sh) understand this, serving `200.html` if no snapshot exists for a URL. If you use a different host I'm sure you can make it do the same.
- `pushstate-server` is used to serve the `build` directory, falling back to `200.html` by default
- The fake browser is JSDOM, set to execute any local scripts (same origin) in order to actually run your React code, but it'll ignore any third-party scripts (analytics or social widgets)
- We start a new JSDOM session for each URL to ensure that each page gets the absolute minimum HTML to render it.
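
The crawl loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's crawler: `pages` is a hypothetical in-memory site standing in for pages rendered in JSDOM, and links are pulled out with a regex purely to keep the example short.

```javascript
// Hypothetical in-memory site: path -> rendered HTML.
const pages = {
  '/': '<a href="/about">About</a> <a href="https://example.com/">External</a>',
  '/about': '<a href="/">Home</a> <a href="/contact">Contact</a>',
  '/contact': '<p>No links here</p>'
}

// Collect hrefs that start with "/" (but not "//"), i.e. same-site paths.
const extractLinks = html =>
  Array.from(html.matchAll(/href="(\/[^"]*)"/g), match => match[1])
    .filter(path => !path.startsWith('//'))

// Breadth-first crawl from `start`, snapshotting each path exactly once.
const crawl = start => {
  const seen = new Set([start])
  const queue = [start]
  const snapshots = {}
  while (queue.length > 0) {
    const path = queue.shift()
    const html = pages[path]
    if (html === undefined) continue // unknown path: nothing to snapshot
    snapshots[path] = html
    for (const link of extractLinks(html)) {
      if (!seen.has(link)) {
        seen.add(link)
        queue.push(link)
      }
    }
  }
  return snapshots
}

const result = crawl('/')
console.log(Object.keys(result))
```

The `seen` set is what stops pages that link back to each other from being rendered twice.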

## The Caveats

This is a hacky experiment at the moment. I would really like to see how far we can take this approach so things "just work" without ever adding config. Off the top of my head:

- [x] ~~Waiting on [pushstate-server#29](https://github.com/scottcorgan/pushstate-server/pull/29). Right now `pushstate-server` serves `200.html` _even if_ a HTML snapshot is present. So once you've run `react-snapshot`, you have to switch to `http-server` or `superstatic` to test if it worked. Or you could just push to [surge.sh](https://surge.sh) each time, which isn't too bad.~~
- [x] ~~Is starting at `/` and crawling sufficient? Might there be unreachable sections of your site?~~
- [x] ~~Should we exclude certain URLs? Maybe parse the `robots.txt` file?~~
- [ ] What if you don't want the `200.html` pushstate fallback? What if you want to remove the bundle (effectively making this a static site generator)?
- [ ] This doesn't pass down any state except what's contained in the markup. That feels ok for simple use-cases (you can always roll your own) but if you have a use-case where you need it and want zero-config raise an issue.
- [x] #2 ~~I'm using a regexp to parse URLs out of the HTML because I wrote this on a flight with no wifi and couldn't NPM install anything. We should use a real parser. You should submit a PR to use a real parser. That would be real swell.~~
- [ ] Should we clone the `build` directory to something like `snapshot` or `dist` instead of modifying it in-place?
- [ ] There's virtually no error checking, so things will just explode in interesting ways. So yeah that should be fixed.
- [ ] Is JSDOM gonna hold us back at some point?
- [ ] If the React code is changing what it renders based on size of viewport then things may "pop in" once the JS loads. Anything driven by media queries should just work though. So stick to Media Queries, I guess?
- [ ] Does someone else want to take this idea and run with it? I would be 100% happy to not be the maintainer of this project :)

## The Alternatives

7 changes: 4 additions & 3 deletions package.json
@@ -1,15 +1,16 @@
{
"name": "react-snapshot",
"version": "1.1.0",
"version": "2.0.0-1",
"description": "",
"main": "lib/index.js",
"repository": "geelen/react-snapshot",
"bin": {
"react-snapshot": "./bin/react-snapshot.js"
},
"scripts": {
"build": "babel --out-dir lib src",
"build:watch": "npm run build -- --watch",
"babel": "babel --out-dir lib src",
"build": "NODE_ENV=production npm run babel",
"build:watch": "NODE_ENV=development npm run babel -- --watch",
"prepublish": "rm -rf lib/* && npm run build"
},
"dependencies": {
3 changes: 2 additions & 1 deletion src/cli.js
@@ -23,7 +23,8 @@ export default () => {
const writer = new Writer(buildDir)
writer.move('index.html', '200.html')

const server = new Server(buildDir, basename, 0, pkg.proxy)
const proxy = process.env.REACT_SNAPSHOT_PROXY || pkg.proxy
const server = new Server(buildDir, basename, 0, proxy)
server.start().then(() => {
const crawler = new Crawler(`http://localhost:${server.port()}${basename}`, options.snapshotDelay, options)
return crawler.crawl(({ urlPath, html }) => {
101 changes: 95 additions & 6 deletions src/index.js
@@ -1,11 +1,100 @@
import ReactDOM from 'react-dom';
import ReactDOMServer from 'react-dom/server';
import React from 'react'
import ReactDOM from 'react-dom'

export const IS_REACT_SNAPSHOT = navigator.userAgent.match(/Node\.js/i) && window && window.react_snapshot_render

const state = {
requests: [],
data: window.react_snapshot_state || {},
count: 0
}

export const render = (rootComponent, domElement) => {
if (navigator.userAgent.match(/Node\.js/i) && window && window.reactSnapshotRender) {
domElement.innerHTML = ReactDOMServer.renderToString(rootComponent)
window.reactSnapshotRender()
window.rootComponent = rootComponent
ReactDOM.render(rootComponent, domElement)
if (IS_REACT_SNAPSHOT) {
window.react_snapshot_render(domElement, state, rootComponent)
}
}

const _snapshot = (func, repeat) => {
const i = state.count++
const existing = state.data[i]
if (existing) {
const { success, failure } = existing
/* This mimics a Promise API but is entirely synchronous */
return {
then(resolve, reject) {
if (typeof success !== 'undefined') resolve(success)
else if (!repeat && reject && typeof failure !== 'undefined') reject(failure)
if (repeat) func().then(resolve, reject)
},
catch(reject) {
if (!repeat && typeof failure !== 'undefined') reject(failure)
if (repeat) func().catch(reject)
}
}
} else {
ReactDOM.render(rootComponent, domElement)
if (!IS_REACT_SNAPSHOT) return func()
const promise = func().then(
success => {
state.data[i] = { success }
return success
},
failure => {
state.data[i] = { failure }
return Promise.reject(failure)
}
)
state.requests.push(promise)
return promise
}
}
export const snapshot = func => _snapshot(func, false)
snapshot.repeat = func => _snapshot(func, true)

const _Snapshot = (prop_defs, repeat_on_client) => {
const prop_names = Object.keys(prop_defs)
if (typeof prop_defs !== "object" ||
prop_names.some(k => typeof prop_defs[k] !== 'function')
) throw new Error("Snapshot requires an object of type { propName: () => Promise }.")
console.log(prop_defs)

const hoc = (Component, render_without_data) => {
class SnapshotComponent extends React.Component {
constructor() {
super()
this.state = { loaded_all: false, async_props: null }
}

componentWillMount() {
_snapshot(
() => Promise.all(prop_names.map(prop_name => prop_defs[prop_name](this.props))),
repeat_on_client
).then(responses => {
const new_state = {}
prop_names.forEach((prop_name, i) => new_state[prop_name] = responses[i])
this.setState({ async_props: new_state, loaded_all: true })
})
}

render() {
if (!this.state.loaded_all && !render_without_data) return null
const props = Object.assign({},
this.props,
this.state.async_props
)
return React.createElement(Component, props)
}
}
SnapshotComponent.displayName = `Snapshot(${Component.displayName || Component.name})`
return SnapshotComponent
}
return {
thenRender: Component => hoc(Component, false),
rendering: Component => hoc(Component, true)
}
}

export const Snapshot = prop_defs => _Snapshot(prop_defs, false)
Snapshot.repeat = prop_defs => _Snapshot(prop_defs, true)
57 changes: 49 additions & 8 deletions src/snapshot.js
@@ -1,10 +1,15 @@
/* Wraps a jsdom call and returns the full page */

import jsdom from 'jsdom'
//import * as ReactMarkupChecksum from 'react-dom/lib/ReactMarkupChecksum'
//import escapeTextContentForBrowser from 'react-dom/lib/escapeTextContentForBrowser'
//import adler32 from 'react-dom/lib/adler32'
//const TEXT_NODE = 3
import ReactDOMServer from 'react-dom/server'

export default (protocol, host, path, delay) => {
return new Promise((resolve, reject) => {
let reactSnapshotRenderCalled = false
let render_called = false
jsdom.env({
url: `${protocol}//${host}${path}`,
headers: { Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" },
@@ -23,17 +28,53 @@ export default (protocol, host, path, delay) => {
virtualConsole: jsdom.createVirtualConsole().sendTo(console),
created: (err, window) => {
if (err) reject(err)
window.reactSnapshotRender = () => {
reactSnapshotRenderCalled = true
setTimeout(() => {
resolve(window)
}, delay)
window.react_snapshot_render = (element, state, rootComponent) => {
render_called = { element, state, rootComponent }
}
},
done: (err, window) => {
if (!reactSnapshotRenderCalled) {
reject("'render' from react-snapshot was never called. Did you replace the call to ReactDOM.render()?")
if (!render_called) {
return reject("'render' from react-snapshot was never called. Did you replace the call to ReactDOM.render()?")
}

const { element, state, rootComponent } = render_called

const next = () => {
const shift = state.requests.shift()
return shift && shift.then(next)
}
/* Wait a short while, then wait for all requests, then serialise */
new Promise(res => setTimeout(res, delay))
.then(next)
.then(() => {
// This approach is really difficult to get working reliably

//Array.from(element.querySelectorAll('*')).forEach(el => {
// const instance_key = Object.keys(el).find(k => k.startsWith('__reactInternalInstance'))
// if (instance_key) el.setAttribute('data-reactid', el[instance_key]._domID)
// if (el.hasChildNodes()) {
// for (let i = 0; i < el.childNodes.length; i++) {
// const tn = el.childNodes[i]
// if (tn.nodeType === TEXT_NODE) tn.data = escapeTextContentForBrowser(tn.textContent)
// }
// }
//})
//

//const markup = element.innerHTML
//console.log(adler32(markup))
//console.log(markup)
//element.innerHTML = ReactMarkupChecksum.addChecksumToMarkup(markup)

// This approach is much more reliable but is it too confusing??
state.count = 0
element.innerHTML = ReactDOMServer.renderToString(rootComponent)

window.document.body.insertAdjacentHTML('afterBegin', `
<script>window.react_snapshot_state = ${JSON.stringify(state.data)};</script>
`)
resolve(window)
})
}
})
})