Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

World builder consuming lot of time in my models #219

Open
alarshi opened this issue Mar 3, 2021 · 20 comments
Open

World builder consuming lot of time in my models #219

alarshi opened this issue Mar 3, 2021 · 20 comments
Labels
performance Requests or proposes performance improvements

Comments

@alarshi
Copy link
Contributor

alarshi commented Mar 3, 2021

I am running 3d global mantle flow models based on a seismic tomography model. The plate boundaries are implemented as a compositional field, faults, using the worldbuilder. However, my models are running very slow (more than 10 times slower) when using the world builder. Below is the log file for a model with and without the faults (notice the time taken to set up initial conditions) :
With faults:
log_with_wb

Without faults:
log_wo_wb

I was wondering if there is a way to modify the worldbuilder implementation for this model, in order to reduce the time it takes to set up the initial conditions.

@MFraters
Copy link
Member

MFraters commented Mar 3, 2021

Hey @alarshi,

Thanks for you question! Without seeing the input file, I find it hard to see exactly why it is suddenly so much slower, but there are several things which could cause it to slow down that much.

  1. Faults are very generic features, which can do the same things as slabs (have curved angles for example). Both faults and slabs are a lot more expensive than the other features due to this flexibility. If you just want to have a fault in a continental plate which goes straight down, you can just make it a continental plate and set it to a different compositional field.
  2. For both slabs and faults (and all the other features) you can set a min depth and max depth. Everything above and below these bounding values is not checked whether the point is in the slab/fault. I suspect setting the max depth to the deepest depth at which the faults are present (crust or lithosphere) will help a bit since the world builder then doesn't have to search in the mantle whether those points are in the fault, but probably not a huge amount since it already creates a bounding box based on the total length of the feature. But it is worth a try.
  3. If the two options above do not work for you, then depending on your model the time might be considerably reduced by not only limiting the search in depth but also in the other two dimensions. This may be particularly useful if you have a lot of short faults in your model. We could take the same approach with the automatic depth bounding box since we know the maximum length and thickness of the fault (and we might even be able to use the direction, although that can be a bit tricky). See for example:
    if (depth <= maximum_depth && depth >= starting_depth && depth <= maximum_total_slab_length + maximum_slab_thickness)
    {
    (it says total_slab_length, but it should be total_fault_length of course...)

As a last note, if this is 905 seconds on your laptop where you do not really solve and/or evolve the problem, and you are going to run the model on a cluster I wouldn't worry too much about the ~900 seconds. Setting up a model with the WB in aspect requires no communication, so it should scale perfectly. This also becomes negligible if you run a model in time for several days.

I hope this helps. Let me know if the first two solutions work for you, or if you would need help with the implementation of the third solution.

Cheers,

Menno

@alarshi
Copy link
Contributor Author

alarshi commented Mar 3, 2021

Hello @MFraters, thank you for your prompt reply!

  1. I am using Bird's (2002) model for the global plate boundaries in worldbuilder. The faults in this model are either vertical or inclined at an angle of 30 or 60 degrees from horizontal. I am not sure if I could define the continental plates first, and then use the plate boundaries.
  2. Currently, I have set the maximum depth to 300km. I think this is reasonable for our models.
  3. I do not set the maximum lateral extent of the fault. I will try that.

I want to point out that our models are instantaneous and runs on the cluster with a finer mesh resolution take a similar proportion of time to set the initial conditions using worldbuilder. I want to share with you the prm file (https://www.dropbox.com/s/8u6lm0ufed2ej3h/tomograpy-based-constant-gs.prm?dl=0_ and the world builder file (https://www.dropbox.com/s/7vrt6zxak2aje8n/geodynamic_world_builder_faults.json?dl=0) I am using. Please note that the prm file would not run because it is based on a different material model and takes in several input files, which I did not share. Let me know if you would want to run the model and I will share the whole project folder.

Thank you!
Regards,
Arushi

@MFraters
Copy link
Member

MFraters commented Mar 3, 2021

Wow, that is interesting. Thanks for the information and explanation. The world builder file is 5.9 Mb (on my computer, drop box says 5.66MB, anyway it is big), which is way more than I have tested it for. To be honest, I am kind a happy to see that it at least still works :)

Looking at your aspect input file and seeing that these are lots of faults over the whole world, I think the best approach is twofold:

  1. I am assuming that you have a script which generated the world builder files. If possible, change the script so that in case that the dip is 90, just use a continental plate. That will already save a quite bit, even if you take the second appoach, but probably not enough.
  2. Implement a test to check the max boundaries of the features with the trench/fault location and the fault length and thickness. This is easy to do for a box geometry, but requires a bit of thinking for the spherical geometry since you have to (depending on you option) take the curvature into account. But even for the spherical case we can probably already get a very large speedup in your case with just a bounding box with is a bit larger than strictly needed.

The basic idea of the second point would be to automatically create a set of points where the fault is always within (so points at distance maximum_total_slab_length plus some margin, in all directions of all the points on the line). We can then use the function polygon_contains_point to check if the current point is in that box and otherwise skip it. Or, if that check needs to be even cheaper, just have 4 points which define a min and max x and y and use that as a bounding box. It will be slightly cheaper in the if statement, but it will most likely require you to check a few more points with the distance_point_from_curved_planes function.

if (depth <= maximum_depth && depth >= starting_depth && 
   depth <= maximum_total_slab_length + maximum_slab_thickness) &&
   Utilities::polygon_contains_point(bounding_polygon_coordinates, 
                                     Point<2>(natural_coordinate.get_surface_coordinates(),
                                     world->parameters.coordinate_system->natural_coordinate_system())))
{
      /// expensive function
      std::map<std::string,double> distance_from_planes =
      WorldBuilder::Utilities::distance_point_from_curved_planes(position,
                                                                 reference_point,
                                                                 coordinates,
                                                                 slab_segment_lengths,
                                                                 slab_segment_angles,
                                                                 starting_radius,
                                                                 this->world->parameters.coordinate_system,
                                                                 true,
                                                                 one_dimensional_coordinates);

...
}

If you are interested in taking this approach, let me know, then we can talk about it.

Cheers,

Menno

@alarshi
Copy link
Contributor Author

alarshi commented Mar 11, 2021

Hello @MFraters: Thank you for the detailed suggestion. I implemented the first point and observed that the time to ```setup Initial conditions `` reduced from 98-99% to about 90%-92%. I was trying to modify the code for applying the maximum lateral extent, but I couldn't figure it out. Could we discuss your approach further?

@MFraters
Copy link
Member

I implemented the first point and observed that the time to ```setup Initial conditions `` reduced from 98-99% to about 90%-92%.

Ok, that helped only a bit as expected. Whatever we do with other optimizations, this will still help.

I was trying to modify the code for applying the maximum lateral extent, but I couldn't figure it out. Could we discuss your approach further?

Sure, I send you an email to plan a meeting.

A third option, if the second option also isn't enough, is that besides calling an expensive function less, you can of course also try to make the expensive function less expensive. Since all your planes have only one segment and are straight, we might be able to optimize a bit more for that. Looking at the code I don't directly have see what could be improved in that sense, but we could profile the code to see where exactly it is spending it's time.

@bangerth
Copy link
Contributor

bangerth commented Jul 2, 2021

I've run this through valgrind's callgrind tool, which counts for every line of the program how much time is spent in it. As expected, about 75% of the overall run time is spent in the Fault::composition function, and here's what I find for where the time there is spent:

  • About 10% of that function's run time is spent in the call to WorldBuilder::Utilities::NaturalCoordinate(position, *(world->parameters.coordinate_system)) at the very top of the function. This is expensive because we do it 303M times in @alarshi 's model :-) Do all "Feature::*" classes work in natural coordinates? If that were the case, then maybe a better approach would be to do the conversion only once before we loop over all features and call this function here: https://github.com/GeodynamicWorldBuilder/WorldBuilder/blob/master/source/world.cc#L337-L339 @alarshi just has a lot of features, and they all repeatedly do the conversion.
  • The remaining time is spent figuring out whether we're close enough to a fault, specifically in the call to distance_to_fault(): https://github.com/GeodynamicWorldBuilder/WorldBuilder/blob/master/source/features/fault.cc#L489 This is expensive, and the only way to avoid this is to not have to do it for every fault every time that function is called. The way to do that is probably to compute a bounding box (in natural coordinates) for every fault (and maybe other feature) and at the top not just check whether the depth is right, but also whether lat/long are within where that fault is located. @MFraters already mentions this option above.

Bounding boxes don't have to be precise: They would just have to give us a way to exit that piece of code without expensive computations most of the time. So a "conservative" bounding box would be totally sufficient.

For your entertainment, here is a kcachegrind screenshot. The left 3/4 of the overall runtime is spent in WorldBuilder, and almost all of which is inside the light green box that corresponds to Fault::composition().
image

@MFraters
Copy link
Member

MFraters commented Jul 3, 2021

That is some great benchmarking, thanks for doing that! I am not surprised that most of the time is spend there since that is the only thing which is really queried in these models. What I am still surprised by is that this is taking so much time compared to what you are doing in aspect... The picture looks really nice, but I can't really read it since the text is cut off at the beginning so that nearly every box just says world builder ;)

About 10% of that function's run time is spent in the call to WorldBuilder::Utilities::NaturalCoordinate(position, (world->parameters.coordinate_system)) at the very top of the function. This is expensive because we do it 303M times in @alarshi 's model :-) Do all "Feature::" classes work in natural coordinates? If that were the case, then maybe a better approach would be to do the conversion only once before we loop over all features and call this function here: https://github.com/GeodynamicWorldBuilder/WorldBuilder/blob/master/source/world.cc#L337-L339 @alarshi just has a lot of features, and they all repeatedly do the conversion.

hmm, that is interesting and I can see why it is an issue with 303M features :P I took a look at the relevant parts of the code and I see no reason why not to directly provide the natural coordinates. The cartesian position will still be needed in the distance_point_from_curved_planes function (or converted later which is also not optimal), so just adding it to the temperature function would be the way to go. An other place where it can be handed over to is the distance_point_from_curved_planes which internally also computes the natural coordinates (again). So precomputing both should help a lot to bring this 10% down. These changes should be trivial to implement. They will be a breaking change for internal function and plugin systems (but not the plugins themselfs), but that is fine with me at this state of the project and the release cycle (with a proper changelog entry)

The remaining time is spent figuring out whether we're close enough to a fault, specifically in the call to distance_to_fault(): https://github.com/GeodynamicWorldBuilder/WorldBuilder/blob/master/source/features/fault.cc#L489 This is expensive, and the only way to avoid this is to not have to do it for every fault every time that function is called. The way to do that is probably to compute a bounding box (in natural coordinates) for every fault (and maybe other feature) and at the top not just check whether the depth is right, but also whether lat/long are within where that fault is located. @MFraters already mentions this option above.

Yes, I still think that calling this function less is where you should be able to get the biggest performance gain. I still haven't gotten around to do benchsmarking of the pull request myself (sorry :( ), but the automatic benchmarks show a good picture (which should even be better because the bounding box is not used for the temperature or grains yet).

Bounding boxes don't have to be precise: They would just have to give us a way to exit that piece of code without expensive computations most of the time. So a "conservative" bounding box would be totally sufficient.

I fully agree that they do not have to be precise, but they do have to be on the safe side. In cartesian coordinates that is very easy, but I am having trouble wrapping my head around what a bounding box would do in spherical coordinates around the poles and a slab goes below the poles for example. I will try to add some tests for this over the coming days to test it. I think the pull request to address this is in very good shape and thanks for adding a review!

One more thing to mention is that if more than one composition is used, it could be beneficial to develop an interface which is able to query all compositions (and other properties) at the same time. This has been in the back of my mind for a while, but I haven't gotten around to think it through, so I don't know how much work that would be. My current estimates ranges between not to difficult to a major rewrite, so that is not really helpful. But this wouldn't matter for your case if you would just use it for one composition.

@bangerth
Copy link
Contributor

bangerth commented Jul 3, 2021

It's possible that you can visualize the profiling with just the attached file. If you install the kcachegrind program, just call it like

kcachegrind callgrind.out.26325

callgrind.out.26325.gz

Then you can hover over individual boxes, see entire profiles, etc.

@MFraters
Copy link
Member

MFraters commented Jul 3, 2021

Thanks for the file, it works. Did you make this with the master version or the branch with the bounding box (#264)?

@MFraters
Copy link
Member

MFraters commented Jul 3, 2021

hmm, the minus and multiply operators for 2d points take together more than 10% of the total time. This seems to be quite high to me since they are just adding or multiplying two numbers. might be another avenue to explorer. With this file I can't see the number of calls made to the function or the source code, so I will have to spin up valgrind for myself :)

@MFraters
Copy link
Member

MFraters commented Jul 3, 2021

With the changes from @alarshi the world builder is actually spending almost 55% of the total time in computing the natural coordinates (78% of the time of time spend on the fault temperature function (where I have added the same bounding box for temperature as for composition). So, yes, pre-computing and passing the natural points together with the bounding box of #264 are certainly going to be important performance improvements. (this might be partly due to the random location and small size of the chunk I am testing it with, but still, it is unneeded)

@bangerth
Copy link
Contributor

bangerth commented Jul 4, 2021

I did my tests with the 0.4.0 release. I will gladly repeat that once #264 and #300 are merged (but I think that #264 still has bugs).

@MFraters
Copy link
Member

MFraters commented Jul 4, 2021

Ah, oke. Just, so you know, since the 0.4.0 there was another major improvement in performance made by @gassmoeller, see #289.

@gassmoeller
Copy link
Contributor

@bangerth which ASPECT version did you use for the benchmarking? geodynamics/aspect#4091 has sped up the point operators significantly, because they no longer use expensive asserts. Inlining them may help further of course.

@bangerth
Copy link
Contributor

bangerth commented Jul 5, 2021

I used the dev version from June 28, revision aspect#84876b96070d3b680d453ed5cfb1d45114f7e41a. That includes 4091.

@bangerth
Copy link
Contributor

bangerth commented Jul 6, 2021

I've got a question about the bounding box issue. If I understand things right, then @alarshi 's #264 re-computes the bounding box every time we ask for it for a given depth. This is of course expensive (sort of). It also does this in the "natural" coordinate system, and this introduces issues if a fault passes by close to or over a pole.

My intuition would actually have been to build the bounding box in 3d, and in Cartesian coordinates. This means that we can build it once, and we don't have the issue with the poles. But it may not be quite as effective if faults are long because then they are sort of straight in a spherical coordinate system, but not in a Cartesian coordinate system. So the question is this: How long are faults typically in descriptions such as those that @alarshi is using?

@MFraters
Copy link
Member

MFraters commented Jul 6, 2021

hmm, yes, that approach would have some advantages. The disadvantage would be that you would need to check for a 3d box instead of a 2d box, but if you can pre-compute the expensive calculations that can certainly be worth it.

My initial thought was that there would be the issue of not knowing your height of the model and depth of the point (that is only known at the time of the request), but if you just define the box by 4 (unit) vectors, you should be able to check if the point is between them. Note that the depth is already accounted for, so we don't care about that in this check.

I also found that the calculation of the depth limit actually has a bug which makes it far to lenient. See #305. This fix could speed up certain models considerably.

@alarshi
Copy link
Contributor Author

alarshi commented Jul 6, 2021

@bangerth: In the global fault database we are using, the faults are not very long given our model domain. The length of faults ranges from ~3 km to ~309 km, with a mean of 27 km. I can try using a 3d bounding box, will definitely need and ask for help :)

@bangerth
Copy link
Contributor

bangerth commented Jul 6, 2021

Thanks, @alarshi !

Current state, with current dev versions of both ASPECT and WB:
image

@bangerth
Copy link
Contributor

bangerth commented Jul 7, 2021

See also #306/#307.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Requests or proposes performance improvements
Projects
Status: No status
Development

No branches or pull requests

4 participants