-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(analytics): Add client-side functionality for analytics collection #5249
base: develop
Are you sure you want to change the base?
Conversation
ffba1b4
to
d86b7c7
Compare
# - Manifest count | ||
|
||
. "$PSScriptRoot\..\lib\json.ps1" # 'ConvertToPrettyJson' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(!(Resolve-DnsName analytics.scoop.sh -ErrorAction SilentlyContinue)) { | |
Write-Host "Could not resolve analytics.scoop.sh" | |
exit 0 | |
} | |
Just skip the whole script if analytics.scoop.sh can't be resolved (because it will get added to public blocklists 😄)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good idea. But I notice that it takes 10 seconds to time out, isn't that a bit much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it returns instantly for me. Maybe because I use a local Adguard in my network. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that it takes time when connected to my Azure VPN server. On other networks it returns instantly.
It's useful regardless, so I'll add it.
Could you tell why? I'll make necessary changes here if possible. |
Just don't like telemetry 😄 Of course, this analysis could help users pick up useful apps and help app developers know how popular their apps are. Apps tend to have looooooooooong EULA to say they don't collect user-identifiable data but who knows. BTW, why not generate random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just static code review.
'TEAMCITY_VERSION' = 'TeamCity' | ||
'TRAVIS' = 'Travis CI' | ||
}.GetEnumerator()) { | ||
if (-not [String]::IsNullOrEmpty((Get-Item "Env:/$($ci_env.Key)" -ErrorAction Ignore).Value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (-not [String]::IsNullOrEmpty((Get-Item "Env:/$($ci_env.Key)" -ErrorAction Ignore).Value)) { | |
if ((Get-Item "Env:/$($ci_env.Key)" -ErrorAction Ignore).Value) { |
❯ if ('') { $true } else { $false }
False
And why there's /
after Env:
?
} | ||
} | ||
foreach ($ci_env in 'BUILD_ID', 'CI') { | ||
if (-not [String]::IsNullOrEmpty((Get-Item "Env:/$($ci_env)" -ErrorAction Ignore).Value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
if ([String]::IsNullOrEmpty((get_config ANALYTICS_ID))) { | ||
set_config ANALYTICS_ID (New-Guid).Guid | Out-Null | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use random ID is better IMO
# Known sources | ||
if ($source -in $known_sources) { | ||
return $true | ||
} | ||
# Local file paths, SSH remotes, and remotes with usernames | ||
if ($source -match '^/[A-Za-z]:/|^[A-Za-z]:/|^\./|^\.\./|file:/|ssh:/|@') { | ||
return $false | ||
} | ||
return $true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Known sources | |
if ($source -in $known_sources) { | |
return $true | |
} | |
# Local file paths, SSH remotes, and remotes with usernames | |
if ($source -match '^/[A-Za-z]:/|^[A-Za-z]:/|^\./|^\.\./|file:/|ssh:/|@') { | |
return $false | |
} | |
return $true | |
# Local file paths, SSH remotes, and remotes with usernames | |
if ($source -match '^/[A-Za-z]:/|^[A-Za-z]:/|^\./|^\.\./|file:/|ssh:/|@') { | |
return $false | |
} else { | |
return $true | |
} |
set_config ANALYTICS_ID (New-Guid).Guid | Out-Null | ||
} | ||
$def_arch = Get-DefaultArchitecture | ||
$known_sources = foreach ($item in (known_bucket_repos).PSObject.Properties) { $item.Value } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$known_sources = foreach ($item in (known_bucket_repos).PSObject.Properties) { $item.Value } |
$stats.machine = [ordered]@{ | ||
Build = [System.Environment]::OSVersion.Version.ToString() | ||
Arch = $def_arch | ||
Desktop = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\PowerShell\3\PowerShellEngine -Name 'PowerShellVersion').PowerShellVersion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All that information is also available in $PSVersionTable
Arch = $def_arch | ||
Desktop = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\PowerShell\3\PowerShellEngine -Name 'PowerShellVersion').PowerShellVersion | ||
Core = if (Get-Command pwsh -ErrorAction Ignore) { | ||
(Get-Item (Get-Command pwsh).Source).VersionInfo.ProductVersionRaw.ToString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Get-Item (Get-Command pwsh).Source).VersionInfo.ProductVersionRaw.ToString() | |
(Get-Command pwsh).Version.ToString() |
$branch = (Get-Content "$PSScriptRoot\..\.git\HEAD").Replace('ref: ', '') | ||
"$(Get-Content (Join-Path "$PSScriptRoot\..\.git" $branch)) ($($branch.Split('/')[-1]))" | ||
} elseif (Test-Path "$PSScriptRoot\..\CHANGELOG.md") { | ||
(Select-String '^## .*([\d]{4}-[\d]{2}-[\d]{2})' "$PSScriptRoot\..\CHANGELOG.md").Matches.Groups[1].Value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Select-String '^## .*([\d]{4}-[\d]{2}-[\d]{2})' "$PSScriptRoot\..\CHANGELOG.md").Matches.Groups[1].Value | |
(Select-String -Pattern '^## \[v([\d.]+)\].*?([\d-]+)$' -Path "$PSScriptRoot\..\CHANGELOG.md").Matches.Groups[1].Value |
The version number is better. I don't think there's user used v0.1.0 and lower.
} elseif (Test-Path "$PSScriptRoot\..\CHANGELOG.md") { | ||
(Select-String '^## .*([\d]{4}-[\d]{2}-[\d]{2})' "$PSScriptRoot\..\CHANGELOG.md").Matches.Groups[1].Value | ||
} else { | ||
"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above, the CHANGELOG.md
should be there, so this will be never reached.
$stats.apps += $newitem | ||
} | ||
|
||
$payload = $stats | ConvertToPrettyJSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The payload needn't be prettified, right?
I have some internal apps and buckets hosted on secret but publicly accessible https sites, would this script expose it? I think it would be better to only report on officially maintained apps and buckets |
I also want to point out that even though I am personally fine with some level of analytics, it is still a bit odd to add them without much warning down the line. I would suggest an opt-in, but is on by default with sufficient notice on new installs. I could also see tracking private usage would be useful for detecting malware using scoop. In which case this data should not be made public and rather sent to security researchers if something looks unusual. |
"machine": {
"Build": "10.0.22621.0",
"Arch": "64bit",
"Desktop": "5.1.22621.1",
"Core": "7.3.2.0",
"Scoop": "d86b7c72e281375e8209f8f23fdedfee790cd9a6 (5249--analytics)"
}, Commit hash could be used to link the data to a Github user via the search function. (e.g. data is send from a user working on a PR) |
TODO: Remove analytics_id and analytics_timestamp from scoop-export.ps1
Description
This PR adds the client-side functionality for sending anonymous usage statistics to a remote server.
Script is disabled for now and will be enabled after the server-side code has been written, open-sourced (under the ScoopInstaller org) and deployed to analytics.scoop.sh where stats will be available for everyone to see (similar to https://formulae.brew.sh/analytics).
Motivation and Context
Relates to #3781
How Has This Been Tested?
Analytics are sent only:
ANALYTICS_DISABLE
config setting is not trueThe analytics script is triggered on each scoop subcommand (except
help
subcommands), and only after the above conditions are satisfied. The script runs asynchronously in background after the issued subcommand is complete, not disturbing it in any way.Following buckets (and their apps) are not sent:
Click here to see a sample payload (from my own machine)
As you can see, the data is completely anonymous and contains no user-identifiable information.
Checklist:
develop
branch.