-
Notifications
You must be signed in to change notification settings - Fork 2
Track app usage #25
Track app usage #25
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments around how we can try to make sure our users know what they're opting into.
I think this file should be restructured since there's so much going on.
Probably something like a 'start' directory with the chunks of logic in separate file with the start.js file orchestrating the chunks?
This is assuming we'd want to port this to the live prototype kit, if this is only for private beta nevermind.
start.js
Outdated
prompt.message = '' | ||
prompt.delimiter = '' | ||
|
||
prompt.get([{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if someone does not read this and presses enter, which one is the default?
Normally CLI applications will do something like
'(y/N)' where the capitalised answer is what would be sent when pressing enter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no default. It says 'Please enter y or n' - this is the same as when we prompt for a new port number if a kit is already running
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah okay :)
start.js
Outdated
|
||
prompt.get([{ | ||
name: 'answer', | ||
description: 'Help us improve the Prototpe Kit\nDo you give permission for the kit to record anonymous usage data? (y/n)', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we link to some form of privacy policy that:
- details our strategy for tracking e.g. using Google Analytics
- what details we are capturing
- how to opt-out if you've opted-in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeh agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should specify what data we are collecting. According to npm, it's "Pageviews, Screenviews, Events, Exceptions, User timings, Transactions"
I wonder if once the user has accepted tracking, the notification that they're being tracked should be shown in the console every time they run the kit, instead of being shown just the once when they accept? It would make it a bit more visible.
I agree with @NickColley that there's a lot going on in start.js
and the functionality should be structured into smaller chunks. But if this is just proof of concept then it's cool.
start.js
Outdated
trackingUser.set('uip','1.1.1.1') | ||
|
||
const kitVersion = packageJson.version | ||
const operatingSystem = os.platform() + ' ' + os.release() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, it would be good to check that os
is not null before getting platform
and release
. If querying for os
fails for any reason and it's null
, this throws up a fatal error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why os
would ever be null? It's a core Node module that's required further up in the file - we don't normally check those are null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke to @joelanman. I like to check for and catch null variables whenever they might arise. However I recognise in this case that's excessive since something's gone fundamentally wrong if os
returns null. Added a ticket to Dev Time as it would be good to understand how other team members handle errors and then add something to docs.
start.js
Outdated
startTracking() | ||
} else { | ||
// opted out | ||
checkFiles() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like checkFiles()
always runs regardless of the fork here (both askForUsageDataPermission()
and startTracking()
contain it) so maybe it could come after the if/else if
block and the else
condition can be omitted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to look for three states here and manage each one separately:
undefined
- we don't have a record of permission either way, and we need to ask
true
- we have permission and we should startTracking
else (false)
- we don't have permission and we skip over tracking, straight to checkFiles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joelanman. I'm not sure the state matters in any way since checkFiles() always runs and we don't pass any arguments to it. Maybe I'm missing something 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joelanman for talking me through this 🙂As askForUsageDataPermission()
contains an asynchronous call, running checkFiles()
which starts the kit is not the right thing to do.
There might be some opportunities here to refactor the functions, maybe by creating a new function startKit()
that calls both checkFiles()
and runGulp()
to make it clearer what's happening. And doing:
if (usageDataConfig.collectUsageData === undefined){
askForUsageDataPermission()
} else {
if (usageDataConfig.collectUsageData === true) {
// opted in
startTracking()
}
// run the kit
checkFiles()
}
But that's just extra niceness and shouldn't block the PR from being merged.
start.js
Outdated
|
||
const description = fs.readFileSync(path.join(__dirname, 'lib/usage-data-prompt.txt'), 'utf8') | ||
|
||
prompt.get([{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this only happen if description
is not null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would description
be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - let's discuss error handling with the team.
@hannalaakso The data recorded is documented here: https://github.com/alphagov/govuk-prototype-kit-private-beta/blob/d8b51e8a92fce5a231370ddf1c9b6f0f995c385e/docs/documentation/usage-data.md Nice point about reminding people every start, I'll look into that. |
On reviewing this with @alex-ju, I'm not sure the Google Analytics approach can work. We want to know for example, what % of our users are on Windows. GA does this with Dimensions, which I don't think we can use (custom dimensions are managed by hand, and only have 200 different values). GA Events (the approach taken in this PR) are not reported on as above. They are reported on as Total events, or Unique events. Either one one show a user starting the kit twice on Windows to have 2 events, and a user starting once on OSX to have 1. This does not give us the data we need. |
Would this stop heroku from deploying the prototype kit since it could get stuck on this question? |
@NickColley it's a good question, but Heroku doesn't run |
Updated with using custom dimensions for storing persistent data (Operating System, Prototype Kit Version and Node Version). |
start.js
Outdated
const nodeVersion = process.versions.node | ||
|
||
// Anonymise the IP | ||
trackingUser.set('anonymizeIp', 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it's better to anonymise at our end, rather than asking Google to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on what we want to achieve by anonymising. By setting an IP ourselves we'll introduce an artificial location but we're sure the IP doesn't go to GA. If we anonymise IP we're not storing the IP but we get some insights that come with that (e.g. location). It's worth having a chat about that.
start.js
Outdated
trackingUser.set('cd3', nodeVersion) | ||
|
||
// Trigger start event | ||
trackingUser.event('State', 'Start', 'Application started').send() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need 'Application started'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well.. we need a label.. I'm not sure what's the best copy for it though.
The format is trackingUser.event(Category, Action, Label).send()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This page says label is not required
https://developers.google.com/analytics/devguides/collection/analyticsjs/events
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, not required. So if it's not helpful for us in the GA interface, I'll remove it :)
Updated with using custom dimensions for storing persistent data (Operating System, Prototype Kit Version and Node Version). Refactored the code. @NickColley can you take a look when you have time (no pressure)? |
@@ -0,0 +1,27 @@ | |||
# Collecting usage data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work.
Since third-parties are involved a user may consent to this data being collected but not consent to the third-party being involved in the process.
Would be good to be specific how this data is collected with which third-parties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the third-party to the 'How it works' and a bit more details to 'Data we collect'.
usage-data.md
start.js
Outdated
|
||
if (usageDataConfig.collectUsageData === undefined) { | ||
// didn't opted | ||
let promptPromise = new Promise(usageData.askForUsageDataPermission) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this block a heroku server from starting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @joelanman pointed out in a comment above it seems like Heroku relies on node ./node_modules/gulp/bin/gulp generate-assets && node server.js
and all our tracking work is in start.js
, then start.js
uses Gulp to call server.js
via nodemon.
Short answer, no, it shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking better 👍
One question about blocking heroku and specifying how we will track users and some non-blocking comments.
start.js
Outdated
|
||
if (usageDataConfig.collectUsageData === undefined) { | ||
// didn't opted | ||
let promptPromise = new Promise(usageData.askForUsageDataPermission) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: I'd expect this 'new Promise' to live inside 'askForUsageDataPermission'
Something like
let promptPromise = usageData.askForUsageDataPermission()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, moved the promise inside the helper function.
@NickColley thanks for the review. tried to address your comments. please let me know if you feel it's good to go or you have any concerns. |
@alex-ju looks great 👍 |
Please don't merge while it has my test GA code on it :) |
@joelanman I'm aware. We can update the tracking code tomorrow. It's up to you when's the right time to merge it, since you opened this PR, just trying to make we have it prepared. |
721a653
to
e9349c6
Compare
@joelanman updated the tracking code. Looking at the Updated: records well! good to be merged from my side |
365a7c7
to
24d2b33
Compare
start.js
Outdated
console.error('ERROR: Node module folder missing. Try running `npm install`') | ||
process.exit(0) | ||
// Local dependencies | ||
const usageData = require('./lib/usage_data') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still breaks the npm install
check, as usage_data then requires modules.
I'd recommend calling checkFiles before this, outside of the if block below (all the cases call it, so it's good to refactor this anyway)
start.js
Outdated
const usageDataConfig = usageData.getUsageDataConfig() | ||
|
||
if (usageDataConfig.collectUsageData === undefined) { | ||
// didn't opted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't right - suggest
// No recorded answer, so ask for permission
start.js
Outdated
// opted in | ||
checkFiles() | ||
runGulp() | ||
usageData.startTracking(usageDataConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe we should startTracking before running gulp, makes more sense that way in my mind, but not sure it matters?
@joelanman thanks for the review! updated. |
@alex-ju it looks like checkFiles still happens after the require I highlighted? |
This helps the team working on the Kit understand how it's being used, in order | ||
to improve it - please don't turn off usage data unless you have to. | ||
|
||
## How it works |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to mention if this just happens locally, or if it will also happen on deployed versions.
It might be worth storing the date the user accepted the opt-in. Might be useful if you ever want to decide whether to send a new opt-in request. |
@edwardhorsford I think thats a good idea, but I think we can add it later. For now the file created date should be ok. It raises the question of whether the 'permission proof' living with the user is sufficient. If they ever lost it there would be no proof. I wonder whether in this case we could prove that data is only sent if a user explicitly opts in. |
@NickColley, can you give us a hand with the final review? I'm aware you already approved, but there were some small changes since then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave it a once over and looks good 👍
UPDATE
I had misunderstood custom dimensions, and we should use them to track operating system, node version, prototype kit version
https://docs.google.com/document/d/13or0ac7H8TdfWfMSt8xnsGgOuTV8jp7CuEtPa6qcP9U/edit#
This work adds opt-in tracking of app usage, using Google Analytics.
It asks for user's permission when first run.
The app then records the user's answer in
usage-data-config.js
.If the answer was yes:
It also records a unique user ID in the same file.
It sends an Event to Google Analytics, with:
I've refactored
start.js
slightly into a series of functions, this makes it a lot easier to handle the correct flow, based on whether the user is opted in or out.Known issues
Screens
When recorded in GA, the event looks like this: