Skip to content
This repository has been archived by the owner on Jul 17, 2018. It is now read-only.

Track app usage #25

Merged
merged 6 commits into from
Apr 18, 2018
Merged

Track app usage #25

merged 6 commits into from
Apr 18, 2018

Conversation

joelanman
Copy link
Contributor

@joelanman joelanman commented Feb 7, 2018

UPDATE

I had misunderstood custom dimensions, and we should use them to track operating system, node version, prototype kit version

https://docs.google.com/document/d/13or0ac7H8TdfWfMSt8xnsGgOuTV8jp7CuEtPa6qcP9U/edit#


This work adds opt-in tracking of app usage, using Google Analytics.

It asks for user's permission when first run.

The app then records the user's answer in usage-data-config.js.

If the answer was yes:

It also records a unique user ID in the same file.

It sends an Event to Google Analytics, with:

  • Category: 'Start'
  • Action: the version of the prototype kit
  • Value: the operating system type (eg Windows) and version

I've refactored start.js slightly into a series of functions, this makes it a lot easier to handle the correct flow, based on whether the user is opted in or out.

Known issues

  • Not tracking Node version
  • The prompt is a bit ugly and needs @amyhupe to check the text
  • I think we need a documentation page in the kit to tell people exactly what usage data is recorded, possibly linked to from the prompt

Screens

image

image

When recorded in GA, the event looks like this:

image

@joelanman joelanman requested a review from NickColley February 7, 2018 18:42
Copy link
Contributor

@NickColley NickColley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments around how we can try to make sure our users know what they're opting into.

I think this file should be restructured since there's so much going on.

Probably something like a 'start' directory with the chunks of logic in separate file with the start.js file orchestrating the chunks?

This is assuming we'd want to port this to the live prototype kit, if this is only for private beta nevermind.

start.js Outdated
prompt.message = ''
prompt.delimiter = ''

prompt.get([{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if someone does not read this and presses enter, which one is the default?

Normally CLI applications will do something like

'(y/N)' where the capitalised answer is what would be sent when pressing enter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no default. It says 'Please enter y or n' - this is the same as when we prompt for a new port number if a kit is already running

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah okay :)

start.js Outdated

prompt.get([{
name: 'answer',
description: 'Help us improve the Prototpe Kit\nDo you give permission for the kit to record anonymous usage data? (y/n)',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we link to some form of privacy policy that:

  • details our strategy for tracking e.g. using Google Analytics
  • what details we are capturing
  • how to opt-out if you've opted-in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh agree

Copy link
Member

@hannalaakso hannalaakso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should specify what data we are collecting. According to npm, it's "Pageviews, Screenviews, Events, Exceptions, User timings, Transactions"

I wonder if once the user has accepted tracking, the notification that they're being tracked should be shown in the console every time they run the kit, instead of being shown just the once when they accept? It would make it a bit more visible.

I agree with @NickColley that there's a lot going on in start.js and the functionality should be structured into smaller chunks. But if this is just proof of concept then it's cool.

start.js Outdated
trackingUser.set('uip','1.1.1.1')

const kitVersion = packageJson.version
const operatingSystem = os.platform() + ' ' + os.release()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it would be good to check that os is not null before getting platform and release. If querying for os fails for any reason and it's null, this throws up a fatal error.

Copy link
Contributor Author

@joelanman joelanman Feb 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why os would ever be null? It's a core Node module that's required further up in the file - we don't normally check those are null

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke to @joelanman. I like to check for and catch null variables whenever they might arise. However I recognise in this case that's excessive since something's gone fundamentally wrong if os returns null. Added a ticket to Dev Time as it would be good to understand how other team members handle errors and then add something to docs.

start.js Outdated
startTracking()
} else {
// opted out
checkFiles()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like checkFiles() always runs regardless of the fork here (both askForUsageDataPermission() and startTracking() contain it) so maybe it could come after the if/else if block and the else condition can be omitted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to look for three states here and manage each one separately:

undefined - we don't have a record of permission either way, and we need to ask
true - we have permission and we should startTracking
else (false) - we don't have permission and we skip over tracking, straight to checkFiles

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @joelanman. I'm not sure the state matters in any way since checkFiles() always runs and we don't pass any arguments to it. Maybe I'm missing something 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @joelanman for talking me through this 🙂As askForUsageDataPermission() contains an asynchronous call, running checkFiles() which starts the kit is not the right thing to do.

There might be some opportunities here to refactor the functions, maybe by creating a new function startKit() that calls both checkFiles() and runGulp() to make it clearer what's happening. And doing:

if (usageDataConfig.collectUsageData === undefined){
  askForUsageDataPermission()
} else {
  if (usageDataConfig.collectUsageData === true) {
    // opted in
    startTracking()
  }
  // run the kit
  checkFiles()
}

But that's just extra niceness and shouldn't block the PR from being merged.

start.js Outdated

const description = fs.readFileSync(path.join(__dirname, 'lib/usage-data-prompt.txt'), 'utf8')

prompt.get([{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this only happen if description is not null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would description be null?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - let's discuss error handling with the team.

@joelanman
Copy link
Contributor Author

@hannalaakso The data recorded is documented here: https://github.com/alphagov/govuk-prototype-kit-private-beta/blob/d8b51e8a92fce5a231370ddf1c9b6f0f995c385e/docs/documentation/usage-data.md

Nice point about reminding people every start, I'll look into that.

@joelanman
Copy link
Contributor Author

On reviewing this with @alex-ju, I'm not sure the Google Analytics approach can work.

We want to know for example, what % of our users are on Windows. GA does this with Dimensions, which I don't think we can use (custom dimensions are managed by hand, and only have 200 different values).

GA Events (the approach taken in this PR) are not reported on as above. They are reported on as Total events, or Unique events. Either one one show a user starting the kit twice on Windows to have 2 events, and a user starting once on OSX to have 1. This does not give us the data we need.

@NickColley
Copy link
Contributor

Would this stop heroku from deploying the prototype kit since it could get stuck on this question?

@joelanman
Copy link
Contributor Author

@alex-ju
Copy link
Contributor

alex-ju commented Feb 28, 2018

Updated with using custom dimensions for storing persistent data (Operating System, Prototype Kit Version and Node Version).

start.js Outdated
const nodeVersion = process.versions.node

// Anonymise the IP
trackingUser.set('anonymizeIp', 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's better to anonymise at our end, rather than asking Google to do it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on what we want to achieve by anonymising. By setting an IP ourselves we'll introduce an artificial location but we're sure the IP doesn't go to GA. If we anonymise IP we're not storing the IP but we get some insights that come with that (e.g. location). It's worth having a chat about that.

start.js Outdated
trackingUser.set('cd3', nodeVersion)

// Trigger start event
trackingUser.event('State', 'Start', 'Application started').send()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need 'Application started'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.. we need a label.. I'm not sure what's the best copy for it though.
The format is trackingUser.event(Category, Action, Label).send()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, not required. So if it's not helpful for us in the GA interface, I'll remove it :)

@alex-ju
Copy link
Contributor

alex-ju commented Mar 6, 2018

Updated with using custom dimensions for storing persistent data (Operating System, Prototype Kit Version and Node Version).

Refactored the code. @NickColley can you take a look when you have time (no pressure)?

@@ -0,0 +1,27 @@
# Collecting usage data
Copy link
Contributor

@NickColley NickColley Mar 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work.

Since third-parties are involved a user may consent to this data being collected but not consent to the third-party being involved in the process.

Would be good to be specific how this data is collected with which third-parties.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the third-party to the 'How it works' and a bit more details to 'Data we collect'.
usage-data.md

start.js Outdated

if (usageDataConfig.collectUsageData === undefined) {
// didn't opted
let promptPromise = new Promise(usageData.askForUsageDataPermission)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this block a heroku server from starting?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @joelanman pointed out in a comment above it seems like Heroku relies on node ./node_modules/gulp/bin/gulp generate-assets && node server.js and all our tracking work is in start.js, then start.js uses Gulp to call server.js via nodemon.
Short answer, no, it shouldn't.

Copy link
Contributor

@NickColley NickColley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking better 👍

One question about blocking heroku and specifying how we will track users and some non-blocking comments.

start.js Outdated

if (usageDataConfig.collectUsageData === undefined) {
// didn't opted
let promptPromise = new Promise(usageData.askForUsageDataPermission)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking: I'd expect this 'new Promise' to live inside 'askForUsageDataPermission'

Something like

let promptPromise = usageData.askForUsageDataPermission()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, moved the promise inside the helper function.

@alex-ju alex-ju changed the title DO NOT MERGE Track app usage Track app usage Mar 12, 2018
@alex-ju
Copy link
Contributor

alex-ju commented Mar 12, 2018

@NickColley thanks for the review. tried to address your comments. please let me know if you feel it's good to go or you have any concerns.

@NickColley
Copy link
Contributor

@alex-ju looks great 👍

@joelanman
Copy link
Contributor Author

Please don't merge while it has my test GA code on it :)

@alex-ju
Copy link
Contributor

alex-ju commented Mar 12, 2018

@joelanman I'm aware. We can update the tracking code tomorrow. It's up to you when's the right time to merge it, since you opened this PR, just trying to make we have it prepared.

@alex-ju alex-ju force-pushed the track-app branch 2 times, most recently from 721a653 to e9349c6 Compare March 27, 2018 08:39
@alex-ju
Copy link
Contributor

alex-ju commented Mar 27, 2018

@joelanman updated the tracking code. Looking at the Real-time events seems to be tracking well. I'll check again in a few minutes in the Event section to make sure they're stored correctly.

Updated: records well! good to be merged from my side

@alex-ju alex-ju force-pushed the track-app branch 2 times, most recently from 365a7c7 to 24d2b33 Compare March 28, 2018 10:56
start.js Outdated
console.error('ERROR: Node module folder missing. Try running `npm install`')
process.exit(0)
// Local dependencies
const usageData = require('./lib/usage_data')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still breaks the npm install check, as usage_data then requires modules.

I'd recommend calling checkFiles before this, outside of the if block below (all the cases call it, so it's good to refactor this anyway)

start.js Outdated
const usageDataConfig = usageData.getUsageDataConfig()

if (usageDataConfig.collectUsageData === undefined) {
// didn't opted
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't right - suggest
// No recorded answer, so ask for permission

start.js Outdated
// opted in
checkFiles()
runGulp()
usageData.startTracking(usageDataConfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe we should startTracking before running gulp, makes more sense that way in my mind, but not sure it matters?

@alex-ju
Copy link
Contributor

alex-ju commented Mar 28, 2018

@joelanman thanks for the review! updated.

@joelanman
Copy link
Contributor Author

@alex-ju it looks like checkFiles still happens after the require I highlighted?

This helps the team working on the Kit understand how it's being used, in order
to improve it - please don't turn off usage data unless you have to.

## How it works

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to mention if this just happens locally, or if it will also happen on deployed versions.

@edwardhorsford
Copy link

It might be worth storing the date the user accepted the opt-in. Might be useful if you ever want to decide whether to send a new opt-in request.

@joelanman
Copy link
Contributor Author

@edwardhorsford I think thats a good idea, but I think we can add it later. For now the file created date should be ok. It raises the question of whether the 'permission proof' living with the user is sufficient. If they ever lost it there would be no proof. I wonder whether in this case we could prove that data is only sent if a user explicitly opts in.

@alex-ju
Copy link
Contributor

alex-ju commented Apr 18, 2018

@NickColley, can you give us a hand with the final review? I'm aware you already approved, but there were some small changes since then.

Copy link
Contributor

@NickColley NickColley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a once over and looks good 👍

@alex-ju alex-ju merged commit 81f6d9d into master Apr 18, 2018
@alex-ju alex-ju deleted the track-app branch April 18, 2018 13:16
@kr8n3r kr8n3r mentioned this pull request May 18, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants