Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build.cmd does not regenerate managed PGO data on changes #53637

Closed
jakobbotsch opened this issue Jun 2, 2021 · 5 comments · Fixed by #56397
Closed

build.cmd does not regenerate managed PGO data on changes #53637

jakobbotsch opened this issue Jun 2, 2021 · 5 comments · Fixed by #56397
Milestone

Comments

@jakobbotsch
Copy link
Member

StandardOptimizationData.mibc does not seem to be regenerated when it exists, even when the opt data package changes:

git checkout d435388^
rm .\artifacts\bin\coreclr\windows.x64.Checked\StandardOptimizationData.mibc
.\build.cmd clr -rc checked -lc release
(gci .\artifacts\bin\coreclr\windows.x64.Checked\StandardOptimizationData.mibc).Length
# prints 8000512
git checkout d435388
.\build.cmd clr -rc checked -lc release
(gci .\artifacts\bin\coreclr\windows.x64.Checked\StandardOptimizationData.mibc).Length
# prints 8000512 again
rm .\artifacts\bin\coreclr\windows.x64.Checked\StandardOptimizationData.mibc
.\build.cmd clr -rc checked -lc release
(gci .\artifacts\bin\coreclr\windows.x64.Checked\StandardOptimizationData.mibc).Length
# prints 9137664

d435388 is a commit where the PGO data changed significantly.

cc @davidwrighton @AndyAyersMS

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jun 2, 2021
@AndyAyersMS
Copy link
Member

Think this is a build system issue, somehow we need to understand that the optimization data is out of date -- basically the step that restores the data needs to be dependent on the props file or something, so it reruns if the props file changes.

@davidwrighton
Copy link
Member

@jakobbotsch do the files in artifacts/mibc change across those builds? I believe our build infra is designed to detect changes to those, but I'm not confident about the bits in artifacts/mibc getting updated correctly.

@jakobbotsch
Copy link
Member Author

@davidwrighton Yes, it seems like those files do change.

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2021
@mangod9 mangod9 added this to the 6.0.0 milestone Jul 6, 2021
@jakobbotsch
Copy link
Member Author

The problem seems to be that the timestamps of the files in artifacts/mibc are the original timestamps from inside the NuGet package. That means they are days in the past. Once we create the merged StandardOptimizationData.mibc, the timestamp becomes the time at which the tool was invoked. When we get newly optimized data in artifacts/mibc this still remains older than StandardOptimizationData.mibc, and thus it doesn't rerun.

Ideally the artifacts/mibc/**.mibc files would have their timestamp reset when they are placed, but that does not seem that easy to do while keeping incremental builds (it would be better if we could set the times after package restore has determined a new optimization data package was needed). Instead, I guess we can change the merged output .mibc file timestamp to be the same as the input files. Then at least dotnet-pgo will rerun automatically when one fetches newer commits (but potentially not if going back in history).

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jul 27, 2021
The timestamp of the merged .mibc file was set to when the tool was
invoked, while the inputs have a timestamp from when they were created
in the training scenarios. That means the target to create the merged
.mibc file would not run incrementally until many weeks later.

To fix add an --inherit-timestamp flag to dotnet-pgo that makes the
merged output inherit the max timestamp from the inputs. This is not
ideal as it means incrementally going backwards does not work, but it's
better than the previous behavior.

Fix dotnet#53637
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 27, 2021
jakobbotsch added a commit that referenced this issue Jul 28, 2021
The timestamp of the merged .mibc file was set to when the tool was
invoked, while the inputs have a timestamp from when they were created
in the training scenarios. That means the target to create the merged
.mibc file would not run incrementally until many weeks later.

To fix add an --inherit-timestamp flag to dotnet-pgo that makes the
merged output inherit the max timestamp from the inputs. This is not
ideal as it means incrementally going backwards does not work, but it's
better than the previous behavior.

Fix #53637
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 28, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants