CarpetX: Interpolation setup #369
Conversation
Ticket is here.

CarpetX/src/interpolate.cxx (outdated):

      }
      if (need_redistribute)
        break;
    }
Redistribute() places every particle into the correct (level, grid, tile, rank) bin, so that particle positions and field tiles line up.
Here level=0, so the level is consistent, and you checked that grid and rank are consistent. However, you didn't check the tile.
On the other hand, the tiling strategy is unlikely to change during a simulation, so this might not be an issue. It can change if you restart from a checkpoint and use a different tiling strategy in the parfile.
Well, you will Redistribute() anyway if you restart from a checkpoint, so you should be fine.
@lwJi thanks for taking a look!
I have added some new code to check that the tiles are consistent.
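For reference, a minimal sketch of what such a tile-consistency check could look like. This is not the code actually added in the PR; the container type and the two lookup callables (particle_tile_of, field_tile_of) are invented placeholders for whatever the real AMReX/CarpetX data structures provide.

```cpp
// Sketch only: report whether every particle's bin tile matches the tile that
// owns the field data at its position.
template <typename Particles, typename ParticleTileOf, typename FieldTileOf>
bool tiles_consistent(const Particles &particles,
                      ParticleTileOf particle_tile_of,
                      FieldTileOf field_tile_of) {
  for (const auto &p : particles) {
    if (particle_tile_of(p) != field_tile_of(p))
      return false; // mismatch: a Redistribute() (or setup rebuild) is needed
  }
  return true;
}
```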
I like this idea. We did the same in Carpet. Instead of using an automatic cache, I recommend splitting the interpolation function into two. The first only sets things up, without interpolating anything, and returns an "interpolation setup". The second function would take an interpolation setup as input and perform the interpolation. In this way there is no automatic mechanism and no checking needed. In Carpet we also have a "world age" that increases whenever the grid setup changes. The world age is stored in the interpolation setup. When the world age does not match, the interpolation setup is recreated automatically.
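To make the suggestion concrete, here is a minimal sketch of that split. All names (InterpolationSetup, make_interpolation_setup, world_age) are invented for illustration and are not Carpet's or CarpetX's actual API.

```cpp
#include <cstdint>

// Global generation counter, bumped by the driver whenever the grid
// structure changes (regridding, new patches, ...).
static std::uint64_t world_age = 0;

// Everything the interpolator needs that depends only on the grid structure
// and the interpolation points, not on the data being interpolated.
struct InterpolationSetup {
  std::uint64_t built_for; // world age this setup was created for
  // ... precomputed (level, grid, tile, rank) assignment of the points,
  //     communication schedule, etc.
};

InterpolationSetup make_interpolation_setup(/* points */) {
  InterpolationSetup setup;
  setup.built_for = world_age;
  // ... locate points, Redistribute(), build send/receive lists ...
  return setup;
}

void interpolate(InterpolationSetup &setup /* , variables, outputs */) {
  if (setup.built_for != world_age)     // grid changed since the setup was built
    setup = make_interpolation_setup(); // recreate it automatically
  // ... perform the actual interpolation using the precomputed setup ...
}
```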
I'm failing to see how an interpolation setup differs from the cache I implemented. Once I understand the difference I can begin working on that. It will probably be significantly different from the current state of this PR, though (I think I will basically have to start from scratch), so maybe change it to a draft?
The main difference is that an interpolation setup can be used right away by the interpolator: it does not need to search a cache, and it does not need to ensure that the cache doesn't overflow.
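In other words (continuing the hypothetical sketch above), the caller simply keeps the setup object alive between calls, so there is no cache key, no lookup, and no eviction policy.

```cpp
// Caller-side pattern, hypothetical: the setup lives with the caller, so the
// interpolator receives it directly instead of searching a cache.
void analysis_output_step() {
  static InterpolationSetup setup = make_interpolation_setup(/* points */);
  interpolate(setup /* , variables, outputs */); // rebuilt internally only if stale
}
```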
Ok, so I hadn't seen Roland's bitbucket comment, sorry about that; I need to look at that. But I think it would be useful (for my understanding) to get a higher-level picture of what you would like to have, so I can plan how the architecture of this would work.
If the above is correct, then I think I know how to get started. But I would also say that this is still a cache, and still automatic in some sense, even though I understand why it is different from what I did and how much simpler the interpolation setup is.
Converted to draft while I rework the implementation |
Force-pushed from 28bbde7 to 079f9bb.
Based on feedback, I've significantly changed this PR. I think the full caching change is complicated and we ought to do it in phases. In this PR I'm implementing the simplest change: I've split the interpolation routine into a setup step and an interpolation step.
Currently, I've changed the interpolation code to use this split.
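As a rough sketch of what this first phase could look like, continuing the hypothetical names from above (this is not necessarily how the PR implements it, and interpolate_entry merely stands in for the public entry point):

```cpp
// Phase 1 sketch: the split exists, but the entry point still rebuilds the
// setup on every call, so results and behaviour are unchanged. Reuse of the
// setup across calls can then be added as a separate, later phase.
void interpolate_entry(/* points, variables, outputs */) {
  InterpolationSetup setup = make_interpolation_setup(/* points */);
  interpolate(setup /* , variables, outputs */);
}
```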
Overview
This PR provides two optimizations for CarpetX's interpolator, both based on caching.
I expect these optimizations to be important for multipatch runs that don't change their grids very often (or at all).
Benchmarks
These are Cactus-timer-based benchmarks that measure Multipatch1_Interpolate, which indirectly calls CarpetX_Interpolate. The benchmark runs 4 BSSN iterations with 7 levels of mesh refinement and no regridding. It uses 8 MPI processes and 5 OpenMP threads per process.
Cactus timer results for the benchmark before optimization.
Cactus timer results for the benchmark after optimization.
Speedups using avg. time (old / new):