
Request: Speed up the uniform mesh converter #991

Closed
keckler opened this issue Nov 21, 2022 · 9 comments · Fixed by #1042
Labels
feature request Smaller user request optimization related to measuring and speeding up the code or reducing memory

Comments

@keckler
Member

keckler commented Nov 21, 2022

The uniform mesh converter can sometimes take a very long time. However, the majority of what it does is loop over all the assemblies in the core and assign parameters back and forth between blocks.

Calls to setAssemblyStateFromOverlaps are all completely independent and could benefit from parallel execution. Based on looking at the uniform mesh generator, this should cover the majority of the work in that class and I think could speed up the conversion process a ton.
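A rough sketch of what that proposal could look like, not ARMI's actual API: since each setAssemblyStateFromOverlaps call is independent, the per-assembly work could be farmed out to a process pool. The `map_assembly_state` helper and the `(source, dest)` pair shape are illustrative stand-ins.

```python
from concurrent.futures import ProcessPoolExecutor

def map_assembly_state(pair):
    """Stand-in for setAssemblyStateFromOverlaps: map source state onto dest."""
    source, dest = pair
    # ...copy block parameters from source onto dest here...
    return dest

def convert_all(assembly_pairs, executor_cls=ProcessPoolExecutor, max_workers=4):
    """Run the independent per-assembly mappings concurrently."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(map_assembly_state, assembly_pairs))
```

The `executor_cls` hook lets you swap in a thread pool for testing; as discussed below, whether process-level parallelism is even viable here depends on whether this code already runs inside an MPI-parallel region.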

@john-science john-science added the feature request Smaller user request label Nov 22, 2022
@ntouran
Member

ntouran commented Nov 22, 2022

Can we at least run the operation through the profiler first to see exactly where the slow bottlenecks are, with precise instrumentation? Then we can see if there's any super low-hanging fruit before parallelizing.

@ntouran ntouran added the optimization related to measuring and speeding up the code or reducing memory label Nov 22, 2022
@keckler
Member Author

keckler commented Nov 22, 2022

Yes, a very good idea! This was just based on the keckler-profiler, aka my eyeballs looking at the exploding runtimes.

@jakehader
Member

jakehader commented Nov 22, 2022

> Can we at least run the operation through the profiler first to see exactly where the slow bottlenecks are, with precise instrumentation? Then we can see if there's any super low-hanging fruit before parallelizing.

I did this months ago but didn't seem to write it down. It is slow on getting and setting number densities for each block in the core. I feel like this suggestion keeps getting made. @john-science did timing tests of this a couple months ago too.

See: #721

@keckler
Member Author

keckler commented Nov 28, 2022

It looks like the uniform mesh converter may be getting a proper overhaul in the near future, so if this is implemented, it should be a part of that larger work.

@john-science
Member

john-science commented Dec 5, 2022

A good process for speeding up code goes something like:

  1. Profile the code.
  2. Review the code profile and see where the slow-downs are and what expensive operations are being called inside nested loops that can easily be improved.
  3. See what data structures are being used.
  4. See if we can lean more on NumPy versus pure Python.
  5. and on and on and on...
  6. Finally, and Very Last: parallelizing the code.

It is not appropriate to say "we need to parallelize this code" before you've done any work to improve its performance.
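For step 1, the standard-library profiler is usually enough. A minimal sketch; `profile_call` is a hypothetical helper, and you'd pass it the actual conversion entry point:

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, stats report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
    stats.print_stats(20)  # top 20 entries is usually enough to spot hot spots
    return result, stream.getvalue()
```

Sorting by cumulative time surfaces the functions that dominate the call tree, which is exactly what steps 2 and 3 need.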

@john-science
Member

Also, and not for nothing, you want to parallelize the uniform mesh converter? Even though the ARMI model is already run in parallel?

You can't spawn parallel code INSIDE code that's already parallel.

Is the uniform mesh converter run outside the already parallel code in the ARMI interface loop?

@mgjarrett
Contributor

mgjarrett commented Dec 19, 2022

I'm still experimenting with the profiler, but here are some preliminary results. I ran a script that loads up a reactor, does a uniform mesh conversion and then de-conversion. 50% of the overall run time is spent in makeAssemWithUniformMesh, with most of that being spent in this block.deepcopy():

[screenshot: profiler output showing makeAssemWithUniformMesh dominated by deepcopy]

I don't think we really need a deepcopy here. We just want a new block that has the same name as the original; we're going to overwrite everything else (height, number densities, parameters, etc.). We could probably cut a good chunk of this overhead by avoiding the deepcopy.

Note: This was a small test reactor; the proportion of run time in each function call might change when we run on a full-size model.
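A toy illustration of the idea, assuming a simplified `Block` class rather than ARMI's real one: construct a fresh block carrying over only the cheap identifying state, instead of deep-copying the whole object graph that we're about to overwrite anyway.

```python
import copy

class Block:
    """Simplified stand-in for an ARMI block."""
    def __init__(self, name):
        self.name = name
        self.height = 0.0
        self.number_densities = {}  # overwritten by the uniform mesh mapping
        self.params = {}            # overwritten by the uniform mesh mapping

def make_uniform_block_deepcopy(source):
    """Old approach: copies every nested attribute, even ones we discard."""
    return copy.deepcopy(source)

def make_uniform_block_light(source):
    """Lighter approach: fresh block sharing only the name with the source."""
    return Block(source.name)
```

The deepcopy version scales with the size of the block's entire state; the light version is constant-time per block.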

@mgjarrett
Contributor

I implemented a method for instantiating a new block that's much lighter than a deepcopy. This reduced the time spent on creating new blocks by an order of magnitude, and now the long pole in the tent is parameterDefinitions.__getitem__.

In a test case where makeAssemWithUniformMesh is called 155 times (for 155 assemblies), parameterDefinitions.__getitem__ is called 2,523,029 times.

[screenshot: profiler output showing parameterDefinitions.__getitem__ call counts]

The expensive line is this one, which walks through all of the parameters defined in the collection and performs a string comparison against the name passed into the function:

```python
matches = [pd for pd in self if pd.name == name]
```

I think we can cut run time down by at least an order of magnitude if this can be converted to a hashed lookup. I opened #1039 to track developments on that front.
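A sketch of the hashed-lookup idea against a toy stand-in for the parameter collection (the class and field names here are illustrative, not ARMI's real ones): build a name-to-definition dict once, then `__getitem__` is a single O(1) lookup instead of an O(n) scan per call.

```python
from collections import namedtuple

ParamDef = namedtuple("ParamDef", ["name", "units"])

class ParamCollection:
    """Toy parameter collection with a hashed name index."""

    def __init__(self, defs):
        self._defs = list(defs)
        # Build the name -> definition index once, up front.
        self._by_name = {pd.name: pd for pd in self._defs}

    def __getitem__(self, name):
        # O(1) dict lookup replaces the per-call linear scan:
        #   matches = [pd for pd in self if pd.name == name]
        return self._by_name[name]

    def __iter__(self):
        return iter(self._defs)
```

With ~2.5 million `__getitem__` calls in the test case above, turning each call from a scan over every definition into one dict lookup is where the order-of-magnitude estimate comes from.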

@mgjarrett mgjarrett mentioned this issue Dec 20, 2022
@keckler keckler changed the title Request: Implement the uniform mesh converter in parallel Request: Speed up the uniform mesh converter Dec 22, 2022
@john-science
Member

john-science commented Dec 27, 2022

Great progress @mgjarrett !

Removing a deepcopy() is usually a win. In most languages, doing a "deep copy" of an object is an expensive operation. So, that makes good sense.

Also, I like that you found one line that was costing a huge portion of the run time. You'd be surprised how often that's true.

You're burying the lede, though. I've always heard that the number densities were the slowest part of ARMI, specifically the meshing. So... you did an actual profile and found otherwise. That's great! Way to use analytical data instead of just trusting the standard wisdom. You rock.

@keckler keckler linked a pull request Jan 13, 2023 that will close this issue