Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternate polygonToCells algorithm #785

Merged
merged 47 commits into from
Oct 31, 2023

Conversation

nrabinowitz
Copy link
Collaborator

@nrabinowitz nrabinowitz commented Oct 3, 2023

This PR implements an alternate algorithm for polygonToCells, which provides polygonToCompactCells as a byproduct.

Goals

Algorithm Overview

let cell = base cell 0
while cell:
  res = getResolution(cell)
  if res == target res:
     if cell is in polygon:
        set output to cell and return
  if res < target res:
     let bbox = boundingBox(cell) // where bbox is padded to contain all descendants
     if bbox intersects polygon:
        if bbox is contained by polygon:
          set output to cell and return
        cell = first child of cell
        continue
  cell = next cell // where "next" is the next sibling, or next sibling of parent, etc
when cell is null we're done

Results

  • This PR includes an iterator-based algo with full test parity to the current algorithm. In its iterator form, it takes very little memory and that memory does not scale with the size of the output - so we can stream cells out of a very large polyfill process without worrying about memory issues.
  • The new algo gives compact (or nearly compact) output by default, so we get polygonToCellsCompact effectively for free.
  • The performance isn't exactly better or worse than the current algorithm - it just has different characteristics. In benchmarks, it's faster than the current algorithm for large numbers of cells (because we can use a small number of compact parent cells) but slower for large numbers of polygon vertices (because the containment check scales with the number of vertices). See benchmarks below.
  • The new polyfill modes are not included in this PR, but should be simple to add based on the work done here (particularly cellBoundaryInsidePolygon and cellBoundaryCrossesGeoLoop).

For Discussion

  • Are the benefits sufficient to replace the existing polygonToCells algo, even if performance degrades in some cases?
  • In order to get the memory benefits, consumers need to use the iterator directly, or we need to implement some sort of batch mode that can fill chunks of memory provided by the caller (might be worthwhile to avoid costs in bindings associated with an FFI per output cell). Which option is better? What if any clean up do we need in the iterator interface if we want to make it part of the public API? Should we make the cellToChildren iterator public as well?

Benchmarks

In the following benchmakrks, SF is a 6-vertex polygon covering San Francisco, Alameda is a 49-vertex polygon covering a similar area, and SouthernExpansion is a 22-vertex polygon covering an area 2x-3x larger than SF.

  • The old polygonToCells performs better when the ratio of output cells to vertexes is low (few cells, complex polygons)
  • The new polygonToCells2 performs better when the ratio of output cells to vertexes is high (many cells, simple polygons)
  • Weirdly, the polygonToCellsCompact function, which is just polygonToCells2 without the uncompact step, performs about the same as polygonToCells2 🤔. I went over this several times, but it seems like the benchmark is correct?

Res 9

Built target benchmarkPolygon
	-- polygonToCellsSF: 2264.810000 microseconds per iteration (500 iterations)
	-- polygonToCellsSF2: 3530.106000 microseconds per iteration (500 iterations)
	-- polygonToCellsSFCompact: 3530.478000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda: 3182.120000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda2: 5960.740000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlamedaCompact: 6027.128000 microseconds per iteration (500 iterations)
	-- polygonToCellsSouthernExpansion: 103521.500000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansion2: 112462.900000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansionCompact: 104374.600000 microseconds per iteration (10 iterations)

Res 10

Built target benchmarkPolygon
	-- polygonToCellsSF: 12268.826000 microseconds per iteration (500 iterations)
	-- polygonToCellsSF2: 11435.998000 microseconds per iteration (500 iterations)
	-- polygonToCellsSFCompact: 11363.694000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda: 15037.306000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda2: 25583.798000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlamedaCompact: 25478.888000 microseconds per iteration (500 iterations)
	-- polygonToCellsSouthernExpansion: 693673.900000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansion2: 672981.900000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansionCompact: 659994.200000 microseconds per iteration (10 iterations)

Res 11

Built target benchmarkPolygon
	-- polygonToCellsSF: 81978.584000 microseconds per iteration (500 iterations)
	-- polygonToCellsSF2: 50297.930000 microseconds per iteration (500 iterations)
	-- polygonToCellsSFCompact: 50223.350000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda: 96910.940000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlameda2: 140856.702000 microseconds per iteration (500 iterations)
	-- polygonToCellsAlamedaCompact: 140873.162000 microseconds per iteration (500 iterations)
	-- polygonToCellsSouthernExpansion: 5245777.500000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansion2: 4613408.400000 microseconds per iteration (10 iterations)
	-- polygonToCellsSouthernExpansionCompact: 4607957.500000 microseconds per iteration (10 iterations)

@nrabinowitz nrabinowitz marked this pull request as draft October 3, 2023 23:45
src/h3lib/include/polyfill.h Outdated Show resolved Hide resolved
src/h3lib/include/polyfill.h Outdated Show resolved Hide resolved
src/h3lib/lib/polyfill.c Outdated Show resolved Hide resolved
src/h3lib/lib/polyfill.c Outdated Show resolved Hide resolved
Comment on lines +300 to +476
// If we make it out of the loop, we're done
iterDestroyPolygonCompact(iter);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what I think of the iterator usually being destroyed for you. On one hand it is convenient, on the other hand the caller will always need to be able to destroy the iterator themselves (e.g. if we exceed their memory or time budget). I guess that only happens if canceling the iteration, so that's how the caller should think of it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a reasonable way to think about it. Perhaps instead of calling it a destroyIterator or etc, it's cancelIterator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 I like destroy, as it clearly indicates "release memory" - we use it for destroyLinkedMultiPolygon as well. I went back and forth about doing this for the caller, but prefer this option:

  • In general, the component allocating memory should be responsible for releasing it, and in this case that's the iterator functions
  • This is more ergonomic for the most common use cases, and ensures proper release in those cases, reducing the potential for bugs

src/h3lib/lib/polyfill.c Outdated Show resolved Hide resolved
iterDestroyPolygonCompact(&(iter->_cellIter));
iter->cell = H3_NULL;
iter->error = E_SUCCESS;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also zero out the child iterator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Comment on lines +405 to +591
for (; iter.cell; iterStepPolygon(&iter)) {
out[i++] = iter.cell;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also assert we don't go over the estimation size here (which would be an out of bounds write)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd want this in production code, since we assume the caller has sized the memory correctly, but the *out array could be any size, so our assertion would only be useful as an internal check that we were meeting our contract. But I think can use the assertion macros to check this in tests.

// Check if the cell is in the polygon
// TODO: Handle other polyfill modes here
LatLng center;
H3_EXPORT(cellToLatLng)(cell, &center);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have a NEVER check for the error

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add

Comment on lines 70 to 74
err = H3_EXPORT(latLngToCell)(&NORTH_POLE, H3_GET_RESOLUTION(cell),
&poleTest);
if (NEVER(err != E_SUCCESS)) {
return err;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Clang warnings want these stores to be to different variables https://github.com/uber/h3/actions/runs/6408852991/job/17398764952?pr=785 rather than a reused err variable. Not 100% sure I agree (since this is really complaining about something that only affects the coverage build) although I think it is marginally clearer than reusing err.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will update - I thought I was being clever by declaring this at the top of each function that could throw, but clearly not.

Comment on lines 174 to 177
test = ((b2->lat - b1->lat) * (a1->lng - b1->lng) -
(b2->lng - b1->lng) * (a1->lat - b1->lat)) /
((b2->lng - b1->lng) * (a2->lat - a1->lat) -
(b2->lat - b1->lat) * (a2->lng - a1->lng));
Copy link
Collaborator

@isaacbrodsky isaacbrodsky Oct 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CI, testPolygonInternal hits an unguarded division by zero on this line:

TEST polygonInternal
/Users/isaac/oss/h3/src/h3lib/lib/polygon.c:175:56: runtime error: division by zero
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Users/isaac/oss/h3/src/h3lib/lib/polygon.c:175:56 in
......................................................................................zsh: abort      ./bin/testPolygonInternal

(to configure: cmake -DCMAKE_C_FLAGS="-fsanitize=undefined,float-divide-by-zero -fno-sanitize-recover=undefined,float-divide-by-zero" .. - must be using clang)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Will fix

Comment on lines 180 to 183
test = ((a2->lat - a1->lat) * (a1->lng - b1->lng) -
(a2->lng - a1->lng) * (a1->lat - b1->lat)) /
((b2->lng - b1->lng) * (a2->lat - a1->lat) -
(b2->lat - b1->lat) * (a2->lng - a1->lng));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, presumably this could result in division by zero

@nrabinowitz
Copy link
Collaborator Author

With the various optimizations I've added so far, I've closed the gaps in the original benchmarks, and I think the new algorithm is likely to be faster most of the time, though still slower as the ratio of vertexes to cells increases (numbers shown below are ratio of old / new):

Res 9

Original:
  Simple: 0.64x
  Complex: 0.53x
  Large: 0.92x

Now:
  Simple: 1.00x
  Complex: 0.81x
  Large: 1.3x

Res 10

Original:
  Simple: 1.07x
  Complex: 0.58x
  Large: 1.03x

Now:
  Simple: 1.56x
  Complex: 0.83x
  Large: 1.36x

@nrabinowitz nrabinowitz marked this pull request as ready for review October 10, 2023 18:05
@nrabinowitz
Copy link
Collaborator Author

Some more benchmarking data - when transpiled into h3-js, the performance looks almost exactly the same, so close that for a while I was worried I wasn't testing correctly:

polygonToCells_all_countries_0 x 3.98 ops/sec ±9.40% (30 runs sampled)
polygonToCells2_all_countries_0 x 3.78 ops/sec ±11.69% (30 runs sampled)
polygonToCells_all_countries_1 x 0.74 ops/sec ±3.42% (21 runs sampled)
polygonToCells2_all_countries_1 x 0.75 ops/sec ±1.33% (21 runs sampled)
polygonToCells_all_countries_2 x 0.53 ops/sec ±2.04% (21 runs sampled)
polygonToCells2_all_countries_2 x 0.55 ops/sec ±1.08% (21 runs sampled)
polygonToCells_all_countries_3 x 0.20 ops/sec ±10.64% (20 runs sampled)
polygonToCells2_all_countries_3 x 0.20 ops/sec ±2.62% (20 runs sampled)
polygonToCells_all_countries_4 x 0.04 ops/sec ±4.17% (20 runs sampled)
polygonToCells2_all_countries_4 x 0.04 ops/sec ±3.42% (20 runs sampled)
polygonToCells_FRANCE_0 x 36.45 ops/sec ±140.62% (28 runs sampled)
polygonToCells2_FRANCE_0 x 30.89 ops/sec ±151.18% (30 runs sampled)
polygonToCells_FRANCE_1 x 26.86 ops/sec ±133.33% (74 runs sampled)
polygonToCells2_FRANCE_1 x 60.46 ops/sec ±61.84% (73 runs sampled)
polygonToCells_FRANCE_2 x 38.19 ops/sec ±78.95% (55 runs sampled)
polygonToCells2_FRANCE_2 x 73.25 ops/sec ±0.60% (56 runs sampled)
polygonToCells_FRANCE_3 x 21.73 ops/sec ±112.03% (43 runs sampled)
polygonToCells2_FRANCE_3 x 27.79 ops/sec ±82.78% (51 runs sampled)
polygonToCells_FRANCE_4 x 21.06 ops/sec ±1.18% (54 runs sampled)
polygonToCells2_FRANCE_4 x 22.21 ops/sec ±0.77% (56 runs sampled)

The all_countries suite uses multipolygons for the 258 countries in the Natural Earth 1:110m vector data, which I figured should offer a wide range of shapes and sizes. I wasn't able to benchmark past res 4, but performance looks almost identical for this dataset. Looking at an individual country like France highlights some differences (in this case, the new version performs slightly better), but overall it's a wash.

Given the memory benefits, I'm considering this a win, though I was hoping it would actually be faster 🤷.

@nrabinowitz
Copy link
Collaborator Author

Even more benchmarking here (now in C) suggests that the resolution of the output cells also makes a difference, which makes sense given the hierarchical search involved in the new algo. This means that the performance of the new algo decreases compared to the old algo as the res gets finer. The following benchmark is for Djibouti:

res 5
  -- polygonToCells_1: 15230.800000 microseconds per iteration (5 iterations)
  -- polygonToCells_2: 14093.200000 microseconds per iteration (5 iterations)
res 6
  -- polygonToCells_1: 46217.600000 microseconds per iteration (5 iterations)
  -- polygonToCells_2: 87146.400000 microseconds per iteration (5 iterations)
res 7
  -- polygonToCells_1: 222787.600000 microseconds per iteration (5 iterations)
  -- polygonToCells_2: 552836.200000 microseconds per iteration (5 iterations)
res 8
  -- polygonToCells_1: 1393634.200000 microseconds per iteration (5 iterations)
  -- polygonToCells_2: 3721000.000000 microseconds per iteration (5 iterations)
res 9
  -- polygonToCells_1:  8811737.600000 microseconds per iteration (5 iterations)
  -- polygonToCells_2: 28252274.800000 microseconds per iteration (5 iterations)

By res 9 it's 3x worse than the current implementation 😢

@nrabinowitz
Copy link
Collaborator Author

nrabinowitz commented Oct 12, 2023

Further investigation shows that there's something else going on, rather than just resolution, but I'm not sure what.

The benchmark for all countries has the old algo better after res 5, but I can't run it for finer resolutions:

Res 0
	-- polygonToCellsAllCountries_1: 1846956.400000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 147822.200000 microseconds per iteration (5 iterations)
Res 1
	-- polygonToCellsAllCountries_1: 3200132.800000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 1224828.200000 microseconds per iteration (5 iterations)
Res 2
	-- polygonToCellsAllCountries_1: 2814667.400000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 1650428.200000 microseconds per iteration (5 iterations)
Res 3
	-- polygonToCellsAllCountries_1: 5786302.800000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 3390271.200000 microseconds per iteration (5 iterations)
Res 4
	-- polygonToCellsAllCountries_1: 14079108.200000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 13361490.400000 microseconds per iteration (5 iterations)
Res 5
	-- polygonToCellsAllCountries_1: 63288210.400000 microseconds per iteration (5 iterations)
	-- polygonToCellsAllCountries_2: 73714663.800000 microseconds per iteration (5 iterations)

Testing Indonesia shows the same pattern, old algo is almost twice as fast by res 7. But testing a smaller set of small Indonesian islands at finer resolutions shows the opposite pattern from res 10 onward:

Res 8
	-- polygonToCellsIndonesia_1: 7767.200000 microseconds per iteration (5 iterations)
	-- polygonToCellsIndonesia_2: 10843.000000 microseconds per iteration (5 iterations)
Res 9
	-- polygonToCellsIndonesia_1: 22478.400000 microseconds per iteration (5 iterations)
	-- polygonToCellsIndonesia_2: 25132.200000 microseconds per iteration (5 iterations)
Res 10
	-- polygonToCellsIndonesia_1: 109577.000000 microseconds per iteration (5 iterations)
	-- polygonToCellsIndonesia_2: 94577.200000 microseconds per iteration (5 iterations)
Res 11
	-- polygonToCellsIndonesia_1: 754375.400000 microseconds per iteration (5 iterations)
	-- polygonToCellsIndonesia_2: 472660.000000 microseconds per iteration (5 iterations)
Res 12
	-- polygonToCellsIndonesia_1: 4869990.800000 microseconds per iteration (5 iterations)
	-- polygonToCellsIndonesia_2: 2903927.600000 microseconds per iteration (5 iterations)

So again, it may be a wash 🤷‍♂️. As far as I can tell, simpler shapes with better compaction yield faster times for the new algo, complex shapes and poor compaction are faster with the old.

@isaacbrodsky
Copy link
Collaborator

isaacbrodsky commented Oct 22, 2023

I put together a little animation of this process (code: https://github.com/isaacbrodsky/h3/tree/updated-polyfill-animation), to aid in visualizing what it's doing. I intentionally cropped out the first and last iterations were it's going over the base cells because the scale is so different compared to where most of the work happens.

30fps (takes a little more than a minute I think):
sf_polyfill

15fps:
sf_polyfill_15fps

@nrabinowitz
Copy link
Collaborator Author

I put together a little animation of this process (code: https://github.com/isaacbrodsky/h3/tree/updated-polyfill-animation), to aid in visualizing what it's doing. I intentionally cropped out the first and last iterations were it's going over the base cells because the scale is so different compared to where most of the work happens.

These are great! I was thinking about making an Observable notebook for this, but it's a little tough without reimplementing the algo, or just animating a log of the output

@nrabinowitz
Copy link
Collaborator Author

On my end, I did some more benchmarking, using all countries with simplified boundaries. The new algo is overall slower for the majority (81%) of the polygons:

image

This shows that the performance difference tends to be significantly worse as the number of cells decreases, with the worst comparisons happening when no cells at all are found within the polygon. I didn't find a correlation between false positives in the bbox checks and bad performance, though I may be looking at it wrong. I have at least one more optimization to try, which would increase false positives but avoid the most expensive parts of cellToBBox.

@nrabinowitz
Copy link
Collaborator Author

Latest commit speeds up the bbox calculation, gaining maybe 20% perf, but still slower for 75% of countries in the benchmark :(.

@nrabinowitz
Copy link
Collaborator Author

Added a lookup table for the res 0 cell bounding boxes, since we check all of them every time, and this seems to have done it! Now in my benchmark (all countries, res 1-5) the new polyfill algo is faster 80% of the time! The main issue here is that the pre-calculated res 0 bboxes seem to wipe out the advantage of the old algorithm for smaller, low-vertex shapes:

image

@coveralls
Copy link

coveralls commented Oct 24, 2023

Coverage Status

coverage: 98.784% (+0.1%) from 98.657% when pulling 6f0a122 on nrabinowitz:updated-polyfill into 5aeff3d on uber:master.

@nrabinowitz
Copy link
Collaborator Author

One additional note here: My new approach to NORMALIZE_LNG seems to result in a lot of cases where we lose branch coverage (and possibly no branch coverage is possible, e.g. if we set the normalization argument explicitly we'll never hit most branches). Open to better options here - I'm also not thrilled that my current implementation hits the most common case, NORMALIZE_NONE, last. In other languages I'd probably pass around a function here, but that's harder and more costly in C. Let me know if you have a better option.

Comment on lines 148 to 155
switch (normalization) {
case NORMALIZE_NONE:
return lng;
case NORMALIZE_EAST:
return lng < 0 ? lng + (double)M_2PI : lng;
case NORMALIZE_WEST:
return lng > 0 ? lng - (double)M_2PI : lng;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCC is complaining that there's no default case here so it's possible to reach the end of the function without returning a number.

Perhaps this could be rewritten as:

Suggested change
switch (normalization) {
case NORMALIZE_NONE:
return lng;
case NORMALIZE_EAST:
return lng < 0 ? lng + (double)M_2PI : lng;
case NORMALIZE_WEST:
return lng > 0 ? lng - (double)M_2PI : lng;
}
switch (normalization) {
case NORMALIZE_EAST:
return lng < 0 ? lng + (double)M_2PI : lng;
case NORMALIZE_WEST:
return lng > 0 ? lng - (double)M_2PI : lng;
default:
return lng;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks

Comment on lines 161 to 163
// --------------------------------------------
// maxPolygonToCellsSize
// --------------------------------------------
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is maxPolygonToCellsSize modified in this PR? Are these tests duplicated?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are duplicated. My intent was to have copy/paste versions of the old tests with the new function, so that when we drop the old function it would just be removing the -Experimental files. But I guess you're right that there's no point duplicating the maxPolygonToCellsSize tests, I can remove.

Comment on lines +81 to +82
DECLSPEC H3Error H3_EXPORT(polygonToCellsExperimental)(
const GeoPolygon *polygon, int res, uint32_t flags, H3Index *out);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, this is not intended to be part of the public API when this PR is merged (as it is not in h3api.h.in?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - I added DECLSPEC so that it would be available for export, but I don't expect us to expose it in the API until we transparently replace the old function.

/**
* Get a base cell by number, or H3_NULL if out of bounds
*/
static H3Index getBaseCell(int baseCellNum) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name looks like it could do the opposite and accept and index and return a number, maybe rename to indexFromBaseCell or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, will rename to baseCellNumToIndex

* Get a base cell by number, or H3_NULL if out of bounds
*/
static H3Index getBaseCell(int baseCellNum) {
if (NEVER(baseCellNum < 0) || baseCellNum >= NUM_BASE_CELLS) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs NEVER if the function is marked non-static and a unit test exercises this case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, can update. It's not clear to me whether static offers any compilation advantages, or whether it just indicates a local function. We settled on static to avoid _underscore names to denote local functions, but I guess it makes sense to test this separately.


// Initialize bounding boxes for polygon and any holes. Memory allocated
// here must be released through iterDestroyPolygonCompact
iter._bboxes = H3_MEMORY(malloc)((polygon->numHoles + 1) * sizeof(BBox));
Copy link
Collaborator

@isaacbrodsky isaacbrodsky Oct 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
iter._bboxes = H3_MEMORY(malloc)((polygon->numHoles + 1) * sizeof(BBox));
iter._bboxes = H3_MEMORY(calloc)(polygon->numHoles + 1, sizeof(BBox));

Requesting this change to head off any integer overflow problem

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update, thanks

@nrabinowitz
Copy link
Collaborator Author

Thanks everyone for the reviews and support on this!

@nrabinowitz nrabinowitz merged commit 5d7c1c1 into uber:master Oct 31, 2023
30 checks passed
@nrabinowitz nrabinowitz deleted the updated-polyfill branch October 31, 2023 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants