Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUMM-1276 VitalCPUReader is implemented #514

Merged
merged 3 commits into from
Jun 14, 2021

Conversation

buranmert
Copy link
Contributor

@buranmert buranmert commented Jun 4, 2021

What and why?

VitalCPUReader is implemented in order to measure CPU utilization of the app.

How?

host_statistics(...) method with HOST_CPU_LOAD_INFO parameter returns CPU utilization data from the kernel.
This low-level API requires working with pointers and type conversions; however

  1. there is no higher level API to get such information
  2. as this is the kernel, hopefully it's likely that they don't change often.

TODO

  • testWhenCPUUnderHeavyLoad test case fails quite often, this must be fixed first kernel fails to report too small intervals, for loop counts increased.

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference

@buranmert buranmert self-assigned this Jun 4, 2021
*/
let userTicks = cpuLoadInfo.cpu_ticks.0
// systemTicks is always 0 (tested in iOS 14.4)
let systemTicks = cpuLoadInfo.cpu_ticks.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to take system ticks into account ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at first i thought systemTicks might be meaningful in case of background network tasks where URLSession spawns another process to load the URL.
but then we decided to ignore cpu ticks at background state as they are unreliable, so yes i should remove systemTicks 👍

it's always 0 in an iOS device and under normal conditions (foreground state) anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/opensource-apple/xnu/blob/master/osfmk/kern/timer.h#L65
this should be the explanation why macOS reports both user and sys ticks while iOS reports user ticks only

https://github.com/opensource-apple/xnu/blob/master/osfmk/kern/host.c#L455
and this is probably where iOS returns false and puts all ticks into CPU_STATE_USER in the else clause

@buranmert buranmert force-pushed the buranmert/RUMM-1276-cpu-vital-reader branch from 3e285f6 to 1e2fafe Compare June 9, 2021 10:35
@buranmert buranmert marked this pull request as ready for review June 9, 2021 10:36
@buranmert buranmert requested a review from a team as a code owner June 9, 2021 10:36
Copy link
Member

@ncreated ncreated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍, but we need to reset unwanted changes to pbxproj.

Datadog/Datadog.xcodeproj/project.pbxproj Outdated Show resolved Hide resolved
Sources/Datadog/RUM/RUMVitals/VitalCPUReader.swift Outdated Show resolved Hide resolved
Sources/Datadog/RUM/RUMVitals/VitalCPUReader.swift Outdated Show resolved Hide resolved
measured cpu ticks during inactive app state is not reliable
it doesn't give 600 ticks per second as it's supposed to.
therefore we ignore/subtract them from total.
also PR comments addressed.
@buranmert buranmert force-pushed the buranmert/RUMM-1276-cpu-vital-reader branch 2 times, most recently from 9fcfb1c to f82a004 Compare June 10, 2021 12:11
@buranmert buranmert requested a review from ncreated June 10, 2021 12:11
@buranmert buranmert force-pushed the buranmert/RUMM-1276-cpu-vital-reader branch from f82a004 to ba9465c Compare June 10, 2021 16:15
let testNotificationCenter = NotificationCenter()
lazy var cpuReader = VitalCPUReader(notificationCenter: testNotificationCenter)

func test_whenCPUUnderHeavyLoad_itMeasuresHigherCPUTicks() throws {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming convention:

Suggested change
func test_whenCPUUnderHeavyLoad_itMeasuresHigherCPUTicks() throws {
func testWhenCPUUnderHeavyLoadItMeasuresHigherCPUTicks() throws {

XCTAssertGreaterThan(highLoadAverage, lowLoadAverage)
}

func test_whenInactiveAppState_itIggnoresCPUTicks() throws {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo & naming convention:

Suggested change
func test_whenInactiveAppState_itIggnoresCPUTicks() throws {
func testWhenInactiveAppStateItIgnoresCPUTicks() throws {


@objc
private func appWillResignActive() {
utilizedTicksWhenResigningActive = readUtilizedTicks()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I looked further, I can see that this will lead to threading issue. The readVitalData() will be accessed from background thread (by RUMScope), but the appWillResignActive() will be called on the main thread. Both access the utilizedTicksWhenResigningActive state by either reading or writing.

We can solve it later, when actually plugging VitalReaders to RUM (this will pop up in threading tests), but it's worth leaving this note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's a good point and i'm still not sure where the VitalReaders will live.
if they live in RUMMonitor we can read data in main thread within rum.start/stopView(...) methods (even before process(command:)) and that probably gives the most accurate measurement.
if they live in the scopes then we will need to add extra code as you said.

i'm leaving a comment 👍

Thread.sleep(forTimeInterval: 1.0)
}

XCTAssertGreaterThan(highLoadAverage, lowLoadAverage)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test seems stable when running on a Mac, but when running on an iPhone, I receive ~13% failures. And it's coming from a reason that the highLoadAverage is sometimes 0. While we can call the last 3 failures a "flakiness" factor, the first 10 do rather look like some measurement issue, WDYT?

Screenshot 2021-06-11 at 08 39 30

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now i'm running heavyLoad() for 1.0 second. that should improve test results.

although usually host_statistics updates its values after 0.01 second (which makes sense when we look at its source code), it sometimes keeps reporting the same values for 1.0 second (which doesn't make sense to me).
never 2.0 seconds or 0.5 second. i still can't figure out why.

i just noticed that when i do Thread.sleep(...), host_statistics returns updated values after 0.01 second most of the time.
without Thread.sleep(...) it usually doesn't update for 1.0 second.

@buranmert buranmert force-pushed the buranmert/RUMM-1276-cpu-vital-reader branch from ba9465c to 22c9ba0 Compare June 12, 2021 10:29
@buranmert buranmert requested a review from ncreated June 12, 2021 10:29
@buranmert buranmert merged commit 1fe60d4 into master Jun 14, 2021
@buranmert buranmert deleted the buranmert/RUMM-1276-cpu-vital-reader branch June 14, 2021 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants