Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory use when writing ParameterSets to files #44727

Merged
merged 1 commit into from
Apr 14, 2024

Conversation

Dr15Jones
Copy link
Contributor

PR description:

When forwarding a VPSet from the input file to the output, avoid calling vpset() which triggers creating the entire vector.

PR validation:

All framework unit tests pass. When this change was applied to a production nanoAOD merge job that was having memory problems (grew to 4GB+) the memory use remained flat (1.9GB).

fixes #44679

When forwarding a VPSet from the input file to the output, avoid
calling vpset() which triggers creating the entire vector.
@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 12, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44727/39937

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @Dr15Jones for master.

It involves the following packages:

  • FWCore/ParameterSet (core)

@makortel, @cmsbuild, @Dr15Jones, @smuzaffar can you please review it and eventually sign? Thanks.
@missirol, @makortel, @wddgit this is something you requested to watch as well.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-300972/38822/summary.html
COMMIT: 87322de
CMSSW: CMSSW_14_1_X_2024-04-12-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44727/38822/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

@makortel
Copy link
Contributor

@makortel
Copy link
Contributor

Looking the max memory used in MaxMemoryPreload output for a few workflows by hand the effect is not really visible in the PR tests (changes were smaller than 0.5 %, and varied to both directions).

So either we are not testing the pathological cases in PR tests, or I didn't happen to look at them.

@makortel
Copy link
Contributor

+core

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Copy link
Contributor

+1

@jenimal
Copy link

jenimal commented Apr 17, 2024

@Dr15Jones is this fix being backported to old cmssw releases so they work when we retry them in production or do we need to do a dirty fix on our end?

@Dr15Jones
Copy link
Contributor Author

is this fix being backported to old cmssw releases so they work when we retry them in production or do we need to do a dirty fix on our end?

Yes. see #44739 #44740 #44741

@makortel
Copy link
Contributor

@jenimal Of which CMSSW release cycles are you particularly interested in? The backports Chris mentioned were down to 13_2_X. For the necessary release cycles we'd also need new releases (@cms-sw/orp-l2), and then somehow those release would have to be used in the merge jobs (but you probably know more about how all that works).

@yanfr0818
Copy link

Hi @makortel, these WFs are using CMSSW_13_2_6_patch2, which probably is already covered by your backports.

@makortel
Copy link
Contributor

Will you be able to use a new 13_2_X release somehow in the existing workflows, or will the fix be applicable only for new workflows?

@yanfr0818
Copy link

So I was checking the CMSSW release that the existing WFs were using. When we do resubmission, it will use the same release. I don't think this is an issue since CMSSW_13_2_6_patch2 is included by 13_2_X, right? So we don't need to use a new release from 13_2_X.

@makortel
Copy link
Contributor

13_2_6_patch2 does not include the backport (well, given that the PR was merged yesterday, none of the 13_2_X releases have the backport). In order to have the fix, we need to build a new 13_2_X release, and somehow the necessary workflows need to use the new release.

@yanfr0818
Copy link

So how a backport works is that, if a fix is backported to a specific set of releases, the fix is only applied to any new release of that set, but not the existing releases?

I'm not sure how viable is to change the CMSSW release setting in a job configuration file. I'll need to find out.

I'm a bit concerned about the future merge jobs. If we build a new 13_2_X release, will the future WFs use it instead of the old release? Otherwise such issue will still emerge.

@makortel
Copy link
Contributor

So how a backport works is that, if a fix is backported to a specific set of releases, the fix is only applied to any new release of that set, but not the existing releases?

Correct. All already-built releases are immutable.

If we build a new 13_2_X release, will the future WFs use it instead of the old release?

That is exactly the question I'm wondering too.

@yanfr0818
Copy link

Hi @makortel. When we backported the fix to a specific CMSSW release, do we increment the version number and create a new release version? If so, what is the new release number? Sorry if I have missed it.

@makortel
Copy link
Contributor

When we backported the fix to a specific CMSSW release, do we increment the version number and create a new release version? If so, what is the new release number?

When a new release in a release cycle is built, the version number is indeed increased. But a new release is built only when really needed (i.e. not after every backport), so you didn't miss anything. If I'm not mistaken the next version in 13_2_X would be CMSSW_13_2_11.

@yanfr0818
Copy link

But a new release is built only when really needed (i.e. not after every backport), so you didn't miss anything.

Fair enough. A new release is built when we accumulate enough changes. Is that also when the update is propagated into CVMFS? After that, future workflows can use the new release like CMSSW_13_2_11.

@makortel
Copy link
Contributor

A new release is built when we accumulate enough changes. Is that also when the update is propagated into CVMFS? After that, future workflows can use the new release like CMSSW_13_2_11.

Correct. I can ask for a new 13_2_X release in the release planning meeting next week (advance warning to @cms-sw/orp-l2).

Does the "future workflows can use the new release" need action from PdmV? (let's tag them in any case @cms-sw/pdmv-l2)

@AdrianoDee
Copy link
Contributor

Does the "future workflows can use the new release" need action from PdmV? (let's tag them in any case @cms-sw/pdmv-l2)

I think the action from our side would be to be sure to use the "cured" releases for any future submission. And eventually ask for a backport of this in case we understand we need it for an older release.

@yanfr0818
Copy link

And eventually ask for a backport of this in case we understand we need it for an older release.

If backporting it to an older release is allowed, can we do it for 13_2_6_patch2?
We are trying to resolve some failed WFs using that release. If it's not possible, we will have to wait for a new release like CMSSW_13_2_11, and do CMSSW + ScramArch override in the resubmission.

@Dr15Jones
Copy link
Contributor Author

@yanfr0818 I’m afraid releases are immutable once published so the jobs would have to switch to the new release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate memory usage increase for merge jobs
7 participants