Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul region map implementation #184

Merged
merged 4 commits into from
Jun 27, 2023
Merged

Conversation

psalz
Copy link
Member

@psalz psalz commented Jun 22, 2023

The region_map is core to many of Celerity's functionalities, as it enables precise tracking of data locations and other things with single-element granularity. Unfortunately the implementation we have used thus far was rather naive and does not perform well when the number of entries gets large (horizons made this less of a problem, but it is still a problem). This PR completely overhauls the region_map implementation, making it a lot... cooler 😎

The new implementation is based on an R-Tree [Guttman 1984] data structure. This is basically a hierarchy of n-dimensional rectangles that allow for O(log N) average query performance (since rectangles can overlap the worst case complexity is still O(N)). This implementation is loosely based (if I remember correctly, it's been a while) on the original Guttman paper, and uses the quadratic split approach (or a hybrid of quadratic seed + linear assignment, in some places).

The region map is different from a regular R-Tree for several reasons:

  • It has a fixed extent which is always completely filled with entries (initially there is a single box with a default value spanning the entire extent).
  • Entries (= a box and a value) can not overlap.
  • Entries with the same value are merged if possible (if they share an edge such that together they form a new box).
  • Querying the region map returns all entries intersecting with the query box. All boxes in the result set are clamped to the query box and merged, if possible (two boxes that could not be merged upon inserting into the tree directly could potentially be merged after clamping!).

The region map also implements a couple of optimizations to better support the aforementioned use cases:

  • If an update operation's box exactly matches with a box already in the tree, the value is updated in-place (no removals or insertions necessary).
  • Updating a value within the region map may require existing boxes to be split into smaller ones (imagine a large 2D box in which a smaller box is placed, the large box will have to be split into 4 pieces surrounding the new box). This results in several erase and insert operations that normally need to be propagated to the root and dispatched from there. If a subtree determines that it can fit all of the new boxes however, it performs a localized update instead.

There are several locations where I suspect some performance could be gained, but I didn't have time to test these and/or thoroughly benchmark them. I therefore opted to keep everything (mostly) as in the version used for CCGrid23, and marked all such locations with TODO PERF comments.

The current implementation assumes that stored values are cheap to copy (this happens frequently during update and query operations). Ways of reducing the number of copies could be investigated, however with upcoming changes to command generation IIRC we no longer store any std::vectors or other larger objects within region maps.

I have some benchmarks that are not included in this PR, as they need more work. In any case, the new implementation is a pretty obvious improvement over the existing one:

rm_old_vs_new

@psalz psalz requested review from PeterTh and fknorr June 22, 2023 21:37
Copy link
Contributor

@fknorr fknorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without too much understanding of R-trees, I would argue that this implementation has seen enough usage in tests to be seen as viable.

In addition to my individual remarks:

  • Do we want to keep all the experimental defines / flags or can we settle on the current configuration and simplify the code by removing some #ifs?
  • Celerity uses int Dims for template parameters everywhere, so the region map should follow suit (also constexpr static dimensions members)
  • Since I have an irrational distaste for iostreams, here's a patch that migrates the new region map to fmt output iterators:
diff --git a/include/region_map.h b/include/region_map.h
index 6f1ca6b1..f301ac7a 100644
--- a/include/region_map.h
+++ b/include/region_map.h
@@ -737,6 +710,29 @@ namespace region_map_detail {
 			return m_children.size();
 		}

+		auto format_to(fmt::format_context::iterator out, const size_t level) const {
+			const auto padding = std::string(2 * level, ' ');
+			auto bounding_box = get_bounding_box();
+			if(!m_contains_leaves) {
+				out = fmt::format_to(out, "{}inner node with bbox {} and {} children:\n", padding, bounding_box, m_children.size());
+				for(size_t i = 0; i < m_children.size(); ++i) {
+					out = get_child_node(i).format_to(out, level + 1);
+				}
+			} else {
+				out = fmt::format_to(out, "{}leaf node with bbox {} and {} values:\n", padding, bounding_box, m_children.size());
+				const auto value_padding = std::string(2 * (level + 1), ' ');
+				for(size_t i = 0; i < m_children.size(); ++i) {
+					auto& v = get_child_value(i);
+					out = fmt::format_to(out, "{}{} : ", value_padding, m_child_boxes[i]);
+					if constexpr(fmt::is_formattable<ValueType>::value) {
+						out = fmt::format_to(out, "{}\n", v);
+					} else {
+						out = fmt::format_to(out, "(value not printable)\n");
+					}
+				}
+			}
+			return out;
+		}
+
 	  private:
 		template <typename RegionMap>
 		friend void sanity_check_region_map(const RegionMap& rm);
@@ -816,29 +812,6 @@ namespace region_map_detail {
 #endif
 			return {};
 		}
-
-		void print(std::ostream& os, size_t level = 0) const {
-			const auto padding = std::string(2 * level, ' ');
-			auto bounding_box = get_bounding_box();
-			if(!m_contains_leaves) {
-				fmt::print(os, "{}inner node with bbox {} and {} children:\n", padding, bounding_box, m_children.size());
-				for(size_t i = 0; i < m_children.size(); ++i) {
-					get_child_node(i).print(os, level + 1);
-				}
-			} else {
-				fmt::print(os, "{}leaf node with bbox {} and {} values:\n", padding, bounding_box, m_children.size());
-				const auto value_padding = std::string(2 * (level + 1), ' ');
-				for(size_t i = 0; i < m_children.size(); ++i) {
-					auto& v = get_child_value(i);
-					fmt::print(os, "{}{} : ", value_padding, m_child_boxes[i]);
-					if constexpr(fmt::is_formattable<ValueType>::value) {
-						fmt::print(os, "{}\n", v);
-					} else {
-						os << "(value not printable)\n";
-					}
-				}
-			}
-		}
 	};

 	inline void assert_dimensionality(const GridBox<3>& box, const int dims) {
@@ -1064,9 +1037,9 @@ namespace region_map_detail {
 			return results_merged;
 		}

-		friend std::ostream& operator<<(std::ostream& os, const region_map_impl& crm) {
-			crm.print(os);
-			return os;
+		auto format_to(fmt::format_context::iterator out) const {
+			out = fmt::format_to(out, "Region Map\n");
+			return m_root->format_to(out, 0);
 		}

 		range<Dims> get_extent() const { return grid_box_to_subrange(m_extent).range; }
@@ -1245,11 +1218,6 @@ namespace region_map_detail {
 			m_merge_candidates = std::move(merge_candidates);
 		}

-		void print(std::ostream& os) const {
-			os << "Region Map\n";
-			m_root->print(os, 0);
-		}
-
 		// Returns the number of leaves (unique box/value combinations, values may be duplicated) in the tree.
 		// NB: Debug utility, NOT cheap
 		size_t count_leaves() const { return m_root->count_leaves(); }
@@ -1392,6 +1360,14 @@ class region_map {
 		}
 	}

+	auto format_to(fmt::format_context::iterator out) const {
+		switch(m_dims) {
+			case 1: return get_map<1>().format_to(out);
+			case 2: return get_map<2>().format_to(out);
+			case 3: return get_map<3>().format_to(out);
+		}
+	}
+
   private:
 	int m_dims;
 	std::variant<std::monostate, region_map_detail::region_map_impl<ValueType, 0>, region_map_detail::region_map_impl<ValueType, 1>,

include/buffer_manager.h Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Show resolved Hide resolved
@psalz
Copy link
Member Author

psalz commented Jun 23, 2023

Thanks for the review, @fknorr!

Do we want to keep all the experimental defines / flags or can we settle on the current configuration and simplify the code by removing some #ifs?

I'd like to think that at some point I'll come back and do some more tuning, but I can also keep that version around on another branch (together with the benchmarks that I haven't included in this PR).

Celerity uses int Dims for template parameters everywhere, so the region map should follow suit (also constexpr static dimensions members)

The reason I'm currently using size_t is because that is what the AllScale boxes use, and constantly converting between types is a pain.

Since I have an irrational distaste for iostreams, here's a patch that migrates the new region map to fmt output iterators:

Sorry about that, the implementation predates your crusade. I've commited your patch separately (and made you the author).

test/graph_compaction_tests.cc Outdated Show resolved Hide resolved
include/region_map.h Show resolved Hide resolved
include/region_map.h Outdated Show resolved Hide resolved
include/region_map.h Show resolved Hide resolved
test/CMakeLists.txt Show resolved Hide resolved
test/CMakeLists.txt Outdated Show resolved Hide resolved
@fknorr
Copy link
Contributor

fknorr commented Jun 25, 2023

I'd like to think that at some point I'll come back and do some more tuning, but I can also keep that version around on another branch (together with the benchmarks that I haven't included in this PR).

Let's not overcomplicate things and keep them in for a future revision.

The reason I'm currently using size_t is because that is what the AllScale boxes use, and constantly converting between types is a pain.

Fine, then these should be replaced once we get a custom range / box implementation.

@psalz psalz force-pushed the upstream/rtree-region-map branch from 3217590 to ff37f1e Compare June 26, 2023 08:53
@psalz psalz force-pushed the upstream/rtree-region-map branch from 5d55196 to 1fc6d78 Compare June 26, 2023 23:24
@github-actions
Copy link

Check-perf-impact results: (9dccbe94fb4adf9f6271b8f650ad9210)
❓ No new benchmark data submitted. ❓
Please re-run the microbenchmarks and include the results if your commit could potentially affect performance.

@celerity celerity deleted a comment from github-actions bot Jun 27, 2023
@psalz psalz force-pushed the upstream/rtree-region-map branch from 972ffcc to b0d3f9b Compare June 27, 2023 12:59
@psalz psalz changed the base branch from master to 0.4.0-alpha June 27, 2023 12:59
@github-actions
Copy link

Check-perf-impact results: (0662ca46677f79d01164d7536818f873)

⚠️ Significant slowdown in some microbenchmark results: benchmark intrusive graph dependency handling with N nodes - 100 / checking for dependencies
🚀 Significant speedup in some microbenchmark results: 57 individual benchmarks affected

Overall relative execution time: 0.73x (mean of relative medians)

psalz and others added 3 commits June 27, 2023 17:03
This replaces the existing region_map class with a new implementation
based on an R-Tree, which greatly improves update and query performance.
...replacing std::ostream& operator<<
@psalz psalz force-pushed the upstream/rtree-region-map branch from b0d3f9b to 026e6b4 Compare June 27, 2023 15:04
@psalz psalz merged commit d9bb361 into 0.4.0-alpha Jun 27, 2023
@psalz psalz deleted the upstream/rtree-region-map branch June 27, 2023 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants