Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offload short range in CoulombPBCAA #3842

Merged
merged 17 commits into from
Feb 18, 2022
Merged

Conversation

ye-luo
Copy link
Contributor

@ye-luo ye-luo commented Feb 15, 2022

Review after #3839

Proposed changes

Offload short range in CoulombPBCAA. The particle range will be chopped into chunks and then compute the pair distance and short range Coulomb chunk by chunk. The reason doing is is to avoid storing AA full table on the device while a large enough chunk size should ensure heavy enough workload per offload kernel. Hard-coded size 64 should be reasonable but will need further investigation.

What type(s) of changes does this code introduce?

  • New feature

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

epyc-server

Checklist

  • Yes. This PR is up to date with current the current state of 'develop'
  • Yes. Code added or changed in the PR has been clang-formatted
  • Yes. This PR adds tests to cover any new code, or to catch a bug that is being fixed
  • Yes. Documentation has been added (if appropriate)

@ye-luo
Copy link
Contributor Author

ye-luo commented Feb 16, 2022

Test this please

@ye-luo
Copy link
Contributor Author

ye-luo commented Feb 16, 2022

Test this please

@ye-luo
Copy link
Contributor Author

ye-luo commented Feb 16, 2022

Test this please

@ye-luo
Copy link
Contributor Author

ye-luo commented Feb 16, 2022

Test this please

@prckent prckent self-requested a review February 17, 2022 16:25
Copy link
Contributor

@prckent prckent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two very minor questions:

  1. I noticed a couple of pragma unrolls. Did you see any benefit when adding these? In 2022 I would hope that the compilers can make a sensible choice for a small fixed size loop.
  2. Unless codecov is confused, the coverage on the change is a little low. Are you able to easily expand the coverage to get >44%?

Copy link
Contributor

@prckent prckent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to help get through the PR backlog. Still interested in unroll & test coverage info.

@prckent prckent enabled auto-merge February 18, 2022 00:17
@prckent
Copy link
Contributor

prckent commented Feb 18, 2022

Test this please

Copy link
Contributor

@PDoakORNL PDoakORNL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks looks pretty good for me.

I made a couple of suggestions but they are just nice to have.

@@ -122,7 +124,15 @@ void HamiltonianFactory::addCoulombPotential(xmlNodePtr cur)
}
#else
if (applyPBC)
targetH->addOperator(std::make_unique<CoulombPBCAA>(*ptclA, quantum, doForces), title, physical);
{
if (use_gpu.empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input handling is pretty inconsistent. At a certain point it would be better to spend the time on an input class.

@@ -51,7 +51,7 @@ TEST_CASE("Coulomb PBC A-A Ewald3D", "[hamiltonian]")
LRCoulombSingleton::CoulombHandler->initBreakup(ions);


CoulombPBCAA caa = CoulombPBCAA(ions, false);
CoulombPBCAA caa(ions, false, false, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are so many flags in a row it might be worth defining an options struct. Reading all these is painful.

struct CoulombPBOptions {
  bool active;
  bool computeForces;
  bool use_offload;
}
...
CoulombPBOptions options;
options.active = false;
options.forces = false;
options.offload = false;
CoulombPBCAA caa(ions, options);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I saw computeForces labelled as deprecated. I think the list won't grow further.

@prckent prckent merged commit 4b27a06 into QMCPACK:develop Feb 18, 2022
@ye-luo
Copy link
Contributor Author

ye-luo commented Feb 18, 2022

Two very minor questions:

1. I noticed a couple of pragma unrolls. Did you see any benefit when adding these? In 2022 I would hope that the compilers can make a sensible choice for a small fixed size loop.

computeDist is simplified from computeDistancesOffload. So unroll is kept. Need to assess both places to determine the removal.

2. Unless codecov is confused, the coverage on the change is a little low. Are you able to easily expand the coverage to get >44%?

Need to run code coverage with offload to host.

@ye-luo ye-luo deleted the offload-dist branch February 18, 2022 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants