
Add SphericalVoxelGrid Filter #2171

Open · sddavis14 wants to merge 12 commits into master

Conversation

@sddavis14 commented Jan 2, 2018

This filter adds the functionality discussed in issue #2044.

Included in this pull request are:

  • the filter implementation
  • a comprehensive unit test which tests the filter against a point cloud that will produce a predictable output
  • a tutorial

As suggested in the issue, I was able to reuse a significant amount of code from the standard VoxelGrid filter. As such, I maintained the old copyright notices and authors with the addition of my name.

It may help to reference the original VoxelGrid source while reviewing this PR.

Thanks.

Maintainer Edit: Closes #2044
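
For context, here is a minimal usage sketch of the filter, based on the tutorial code quoted later in this thread (the setter names changed during review, so treat the exact calls as an assumption rather than the final API):

#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <pcl/filters/spherical_voxel_grid.h>

int
main ()
{
  pcl::PointCloud<pcl::PointXYZI>::Ptr input (new pcl::PointCloud<pcl::PointXYZI>);
  pcl::PointCloud<pcl::PointXYZI> output;

  pcl::io::loadPCDFile<pcl::PointXYZI> ("spherical_voxel_grid_example.pcd", *input);

  pcl::SphericalVoxelGrid<pcl::PointXYZI> voxel;
  voxel.setInputCloud (input);
  // Radial leaf size plus vertical/horizontal division counts, as in the tutorial
  voxel.setLeafSize (0.1, 90, 300);
  voxel.filter (output);

  return (0);
}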

Spencer Davis added 3 commits January 2, 2018 02:44
This commit adds the SphericalVoxelGrid filter to the filters library,
which allows point clouds to be downsampled using a leaf size specified
in spherical coordinates.
@taketwo (Member) commented Jan 2, 2018

Hi, thanks for the effort.
A few immediate comments:

  1. Free functions getMaxDistance()
    • Needs a better name, because we are not getting a distance but rather a point
    • Needs improved documentation to make clear that the point is searched for within the point cloud
    • I think the function is too heavy. Essentially, it finds a point among a filtered point cloud, and it provides two ways to filter: by index and by field limits. I think the second one should be left out. If needed, the user can first build an index list with a dedicated field-limit filter and then pass it to this function (see the sketch after this list). What do you think?
    • Move it to "common/distances.h" to encourage potential reuse.
  2. Specialization for PCLPointCloud2. Nowadays we typically do not do this because it means code duplication (and the possibility of the two implementations diverging in the future). We assume that the user will first convert their point cloud to the "native" PCL format and then perform the processing steps. Unless you really need this specialization for performance reasons, to avoid conversions in your own projects, I would drop it; if you do need it, we can keep it.
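
A sketch of the suggested workflow, combining the index-based getMaxDistance() overload already present in common.h with PassThrough to build the index list (a standalone example under assumed field names, not code from this PR):

#include <pcl/common/common.h>
#include <pcl/filters/passthrough.h>
#include <pcl/point_types.h>

// Hypothetical helper: find the farthest point among points passing a field-limit filter
Eigen::Vector4f
farthestFilteredPoint (const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
  // Build the index list with a dedicated field-limit filter first...
  std::vector<int> indices;
  pcl::PassThrough<pcl::PointXYZ> pass;
  pass.setInputCloud (cloud);
  pass.setFilterFieldName ("z");
  pass.setFilterLimits (0.0, 10.0);
  pass.filter (indices);

  // ...then query the farthest point with the index overload from common.h
  const Eigen::Vector4f pivot = Eigen::Vector4f::Zero ();
  Eigen::Vector4f max_pt;
  pcl::getMaxDistance (*cloud, indices, pivot, max_pt);
  return (max_pt);
}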

@sddavis14 (Author)

  1. Fully agree on the name of getMaxDistance(). However, I chose this name because other overloads of the function are already available in common.h and I wanted to preserve the interface since the function does the same thing (with the added filter constraints). This ties into the next comment -

  2. I agree that the function is pretty complicated. If we choose to drop the filter arguments, the functions will no longer be needed because we can just rely on the getMaxDistance() already present in common.h. The function is pretty specific to this use case, so it probably makes sense to just move the filtering down into the applyFilter() function anyway. I will go ahead and make this change.

  3. I will drop the PointCloud2 implementation. I implemented it because of this line in the "Writing a new PCL Class" tutorial:

The PCL Filtering API specifies that two definitions and implementations must be available for every algorithm: one operating on PointCloud and another one operating on PCLPointCloud2. For the purpose of this tutorial, we will concentrate only on the former.

I imagine it would help other contributors if this were updated.

Thanks for the quick comments.

@taketwo (Member) commented Jan 2, 2018

I will go ahead and make this change.

👍

I imagine it would help other contributors if this were updated

Certainly, we should change it. Thanks for letting us know, and sorry for the extra effort.

@taketwo (Member) commented Jan 2, 2018

I have one more quick comment regarding phi/theta and setter/getter functions. Maybe it is more intuitive to name the functions like setDivisionsVertical() or getLeafSizeHorizontal()? Otherwise one will always need to check documentation to figure out what is phi and what is theta. @SergioRAgostinho?

@sddavis14 (Author)

Discussed changes are implemented.

I agree regarding the leaf size functions and have renamed the external interface to use the more intuitive vertical/horizontal language; phi/theta is now only used internally.

Should I modify that tutorial here or would another PR be more appropriate?

/** \brief Get the radial size of the leaves
*/
inline float
getLeafSizeR () { return (leaf_size_r_); }
Member

This method can be marked const (same for all the other getters).
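
For example (a sketch of the suggested change):

/** \brief Get the radial size of the leaves. */
inline float
getLeafSizeR () const { return (leaf_size_r_); }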


/** \brief Get the cartesian origin used to build the spherical grid.
*/
inline Eigen::Vector3f
Member

Is there any pragmatic reason to have a 3-vector in getOrigin but a 4-vector in setOrigin?

Author

No. I believe this was mimicking some of the behavior in VoxelGrid. Is there a particular reason to prefer 4 vs 3 vectors (SIMD instructions or something)? If not, I'll switch setOrigin to take a 3-vector since it needs 3-dimensional coordinates.

Member

Taking into account how this is used in the class (single vector subtraction inside a fat loop), I would be very surprised to see any performance difference. I'd back switching to 3-vectors.

* \param[out] limit_max the maximum allowed field value
*/
inline void
getFilterLimits (double &limit_min, double &limit_max)
Member

Did you choose the API for this function following some similar existing function in PCL? If so, let's keep it this way for consistency; otherwise I'd propose returning std::pair<double, double>.
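
For reference, the std::pair alternative would look roughly like this (just a sketch; as discussed below, the PR keeps the VoxelGrid-style interface instead):

/** \brief Get the field filter limits. */
inline std::pair<double, double>
getFilterLimits () const { return (std::make_pair (filter_limit_min_, filter_limit_max_)); }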

Author

Yes, this is the same filter field interface that VoxelGrid uses.

Member

OK, let's keep it this way then.

* \param[out] limit_negative true if data \b outside the interval [min; max] is to be returned, false otherwise
*/
inline void
getFilterLimitsNegative (bool &limit_negative) { limit_negative = filter_limit_negative_; }
Member

Why not return the value instead?

Author

To preserve a consistent interface with VoxelGrid.

* Software License Agreement (BSD License)
*
* Point Cloud Library (PCL) - www.pointclouds.org
* Copyright (c) 2009, Willow Garage, Inc.
Member

I don't think we need Willow here. Your contribution is brand new code, even if you were "inspired" by the voxel grid implementation. As such, it should be copyrighted for Open Perception 2018.

{
if (!input_->is_dense)
// Check if the point is invalid
if (!pcl_isfinite (input_->points[*it].x) ||
Member

Can be replaced with isFinite (input_->points[*it])

// Get the filter value
const uint8_t* pt_data = reinterpret_cast<const uint8_t*> (&input_->points[*it]);
float filter_value = 0;
memcpy (&filter_value, pt_data + fields[filter_idx].offset, sizeof (float));
Member

Being picky, not every field is of type float. Intended use is with float fields though, so this is probably ok.

Author

How about I introduce a check to verify that the field is a float?

Member

Based on my previous comment, this assumption that the type is a single precision floating point might need to be revised.

Member

I don't think it makes sense to invoke memcpy just to copy a single float. Make filter_value a pointer and initialize it at the correct position.
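
A sketch of the pointer-based variant (still assuming the field is stored as a float, as discussed above):

// Point at the filter field directly instead of copying it out with memcpy
const uint8_t* pt_data = reinterpret_cast<const uint8_t*> (&input_->points[*it]);
const float* filter_value = reinterpret_cast<const float*> (pt_data + fields[filter_idx].offset);
// ...and dereference *filter_value wherever the copied value was used before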


// Find the number of radial layers required given the farthest point and resolution
max_radius_ = (max_p - filter_origin_).norm ();
uint64_t rIdxNum = static_cast<uint64_t>(std::floor(max_radius_ / leaf_size_r_)) + 1;
Member

Why did you choose 64 bit here? And wouldn't it be more consistent to cast the division operands to double then?

Author

The purpose of this calculation is to verify that the number of discrete voxels does not overflow an int32_t. The multiplication within the if statement below is performed with uint64_ts to ensure that the potential overflow is detected. In this case, I just performed the cast here instead of within the if statement. I will move the cast into the if statement and switch rIdxNum to int32_t to make things more consistent.
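
A sketch of the described check, with the 64-bit casts moved into the comparison (the division-count member names below are placeholders, not necessarily the names used in the PR):

const int32_t r_idx_num = static_cast<int32_t> (std::floor (max_radius_ / leaf_size_r_)) + 1;

// Detect overflow of the total voxel count before sizing the grid
if (static_cast<uint64_t> (r_idx_num) *
    static_cast<uint64_t> (num_vertical_divisions_) *
    static_cast<uint64_t> (num_horizontal_divisions_) >
    static_cast<uint64_t> (std::numeric_limits<int32_t>::max ()))
{
  PCL_WARN ("[pcl::SphericalVoxelGrid::applyFilter] Leaf size is too small for the input data. Doing nothing.\n");
  output = *input_;
  return;
}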


PointT shiftedPoint = input_->points[*it];

shiftedPoint.x -= filter_origin_[0];
Member

shiftedPoint.getVector4fMap () -= filter_origin_

shiftedPoint.z -= filter_origin_[2];

// Convert to spherical coordinates
float r = std::sqrt (shiftedPoint.x * shiftedPoint.x +
Member

shiftedPoint.getVector3fMap ().norm ()
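
Putting the two Eigen-map suggestions together (a sketch; it assumes the fourth component of filter_origin_ is zero, so the homogeneous subtraction leaves the point's w untouched):

PointT shifted_point = input_->points[*it];
shifted_point.getVector4fMap () -= filter_origin_;

// Convert to spherical coordinates: radial distance from the shifted origin
const float r = shifted_point.getVector3fMap ().norm ();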

@taketwo (Member) commented Jan 3, 2018

Should I modify that tutorial here or would another PR be more appropriate?

Please another PR.

Make functions const, make origin function vectors consistent,
delete getMaxDistance declarations, add point field type check
@taketwo (Member) commented Jan 4, 2018

Thanks! From the interface and implementation perspective everything looks fine to me now. Only the formatting does not match the style guide, which requires a space between function names and opening brackets. (Sorry for that :-))

I'll be away for several weeks starting this weekend. In the meantime I guess @SergioRAgostinho will be back, provide his comments, and then hopefully merge this.

@taketwo added this to the pcl-1.9.0 milestone Jan 4, 2018
@sddavis14 (Author) commented Jan 4, 2018

Fixed. Thanks for the quick review!

Also, I need to add the point cloud that the user can download as part of the tutorial. Where do I put this pcd file so that the tutorial link works correctly?

@SergioRAgostinho added the needs: code review label Jan 9, 2018
@taketwo (Member) commented Jan 27, 2018

We typically put PCD files for tutorials into this repository: https://github.com/PointCloudLibrary/data/tree/master/tutorials.

taketwo previously approved these changes Jan 27, 2018

project(voxel_grid)

find_package(PCL 1.8.1 REQUIRED)
Member

This should be 1.8.1.99. Reason: 1.8.1 is already released and it does not include your filter. We indicate "the next release after x.x.x" by appending .99 suffix. When time comes to release the next version (whichever number it will have), we will search/replace all occurrences.

@SergioRAgostinho (Member)

I have one more quick comment regarding phi/theta and setter/getter functions. Maybe it is more intuitive to name the functions like setDivisionsVertical() or getLeafSizeHorizontal()? Otherwise one will always need to check documentation to figure out what is phi and what is theta. @SergioRAgostinho?

Agreed. Off to start reviewing this now 🛫 Sorry for the delay.

@SergioRAgostinho (Member) left a comment

A partial review, but already some work for you.

#include <pcl/filters/spherical_voxel_grid.h>

int
main (int argc, char** argv)
Member

You don't use argc or argv so don't define them.

int
main (int argc, char** argv)
{
pcl::PointCloud<pcl::PointXYZI>::Ptr input(new pcl::PointCloud<pcl::PointXYZI>);
Member

Spacing.

main (int argc, char** argv)
{
pcl::PointCloud<pcl::PointXYZI>::Ptr input(new pcl::PointCloud<pcl::PointXYZI>);
pcl::PointCloud<pcl::PointXYZI>::Ptr output(new pcl::PointCloud<pcl::PointXYZI>);
Member

This doesn't need to be a shared pointer. You're always employing it as *output.

Spacing.


pcl::io::loadPCDFile<pcl::PointXYZI> ("spherical_voxel_grid_example.pcd", *input);

std::cout << "Size before downsample: " << input->points.size() << std::endl;
Member

Please use input->size () instead.

Spacing.

voxel.setLeafSize (0.1, 90, 300);
voxel.filter (*output);

std::cout << "Size after downsample: " << output->points.size() << std::endl;
Member

Please use output->size () instead.

Spacing.


/** \brief Get the cartesian origin used to build the spherical grid.
*/
inline Eigen::Vector3f
Member

Return const Eigen::Vector3f& instead. That way no copy is generated if the user doesn't want it.

Member

But we are returning the result of a head<3>() call; this cannot be a reference.

Member

Then what about changing the getters and setters to Eigen::Vector4f which is in reality what we store?

Member

What we store is an implementation detail, I don't think it should influence our getters/setters. So the question is rather what is more convenient for the user: non-homogeneous or homogeneous coordinates? And I have no answer to this :)

Member

All our point types are inherently homogeneous and we do provide Eigen maps for both 3f and 4f types, from and to them, so I feel having the Vector4f version in the getter/setter places no inconvenience from the user perspective, allowing us to "potentially" save a copy.

(That was a long sentence!)

Anyway, it's not a big thing, and since we don't seem to have a clear opinion here I leave it up to @sddavis14 to decide in light of the additional information we provided.

Author

I will come back to this issue and make a final decision when time permits. For now, I have fixed the other problems discussed.

setFilterFieldName (const std::string &field_name) { filter_field_name_ = field_name; }

/** \brief Get the name of the field used for filtering. */
inline std::string const
Member

Return a const std::string& instead for the same reasons mentioned above.

double filter_limit_min_;

/** \brief The maximum allowed filter value a point will be considered from. */
double filter_limit_max_;
Member

PCL coordinates are all stored in floats. I'm not sure it makes sense to have doubles representing the filter limits. @taketwo?

Member

Sounds reasonable

* \param[out] limit_negative true if data \b outside the interval [min; max] is to be returned, false otherwise
*/
inline void
getFilterLimitsNegative (bool &limit_negative) const { limit_negative = filter_limit_negative_; }
Member

This method seems to be a copy of the one below. I vote for keeping the one below.

bool filter_limit_negative_;

/** Cartesian origin of the spherical coordinate system. */
Eigen::Vector4f filter_origin_;
@SergioRAgostinho (Member) Jan 28, 2018

Since you store an Eigen vectorizable type internally in your class, you need to define the Eigen macro for overloading the new operator.

Member

That was already done in the base class Filter

Member

Are you sure that's enough? I never looked into the internals of the macro to understand what exactly it does.


@SergioRAgostinho added the needs: author reply label and removed the needs: code review label Jan 28, 2018
@SergioRAgostinho added the needs: code review label and removed the needs: author reply label Feb 6, 2018
@SergioRAgostinho self-assigned this Feb 9, 2018
@SergioRAgostinho (Member) left a comment

Edit: Sorry for the bad copy paste.

  • I'm not particularly fond of the whole sorting mechanism to "pseudo cluster" indices. I believe you should use a vector/map of vectors of indices to cluster things. Sorting is an N*log(N) operation which in this case can be avoided. Also, since the downsampling is done by averaging, once you build your "spherical voxel" grid you can literally do everything in a single pass, because you can compute an average in an online fashion (see the sketch after this list). @PointCloudLibrary/admins opinions?

  • We specify angular resolution in terms of divisions, but I feel that in a normal case I would want to specify the vertical and horizontal resolution in terms of angles. Working in divisions ensures that all voxels span the same vertical and horizontal angles uniformly, which is something we can't ensure if a user would suggest something like 100º for the vertical resolution since it's not a divisor of 180º. Therefore it would be cool to have the option of "suggesting" a resolution specifying angles and then we would "round" that suggestion ensuring we have uniform angular intervals.

  • I believe we should further explore the possibility of having a common base class that serves as a basis for all other VoxelGrid classes. These parameters are common to both VoxelGrid and Spherical: min_points_per_voxel_, filter_field_name_, filter_limit_min_, filter_limit_max_, filter_limit_negative_. The point is, there are bounding box/filter functionalities that should be common to some extent to all voxel grid filters, and I feel those should be shared if possible. I'm unsure at this point whether the best option is to create a new base class or to refactor/virtualize some of the VoxelGrid methods. Request for comments @PointCloudLibrary/admins.

  • Regarding the unit tests, please have a look at Test Fixtures; if you want a comprehensive example implemented in PCL, the octree iterator test is the way to go.

  • I suggested a few refactoring actions in the applyFilter method in hopes of improving its readability. I've personally been trying the rule of thumb of making sure no single function spans longer than a full screen, and it has served me well. This assumes, of course, that there are no major penalties for doing so, and I don't believe there are any in the cases I suggested. Let's leave those changes to the very last, since the code can still change considerably in the meantime.

  • Nice to have: every time you're defining local variables, declare them const whenever they're actually const.
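
To illustrate the single-pass averaging mentioned in the first bullet, a sketch of an incremental per-voxel mean (the voxel count and the index helper are hypothetical placeholders, not code from the PR):

// Running per-voxel centroid, updated in one pass over the cloud; no sorting involved
std::vector<Eigen::Vector4f> means (num_voxels, Eigen::Vector4f::Zero ());
std::vector<int> counts (num_voxels, 0);

for (const auto& point : *input_)
{
  const std::size_t v = computeVoxelIndex (point);   // hypothetical point -> voxel mapping
  ++counts[v];
  // Online mean update: m_k = m_{k-1} + (x_k - m_{k-1}) / k
  means[v] += (point.getVector4fMap () - means[v]) / static_cast<float> (counts[v]);
}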


project(spherical_voxel_grid)

find_package(PCL 1.8.1.99 REQUIRED)
@SergioRAgostinho (Member) Feb 9, 2018

This is going for 1.9.0 so you can change it already.

Member

Even though we know this will be 1.9.0, that version does not exist yet; current master compiles as 1.8.1.99. If we already request 1.9.0, it will be impossible to build this tutorial.

@SergioRAgostinho (Member) Feb 12, 2018

This places on us the responsibility of grepping everything before the release and bumping versions in all tutorials. But that's in fact what we committed to, as I can see in #1683.

Edit: I updated the wiki page with that info.

--------

First, download the dataset `spherical_voxel_grid_example.pcd
<https://raw.github.com/PointCloudLibrary/data/master/tutorials/spherical_voxel_grid_example.pcd>`_
Member

I don't see this file in this PR.

Member

Note that the file is in the data repository. Still, it's missing, you are right.

@SergioRAgostinho (Member) Feb 12, 2018

Meaning we're gonna need a PR there adding the file. @sddavis14 please then add a link to the PR here for traceability purposes.


#define PCL_INSTANTIATE_SphericalVoxelGrid(T) template class PCL_EXPORTS pcl::SphericalVoxelGrid<T>;

#endif // PCL_FILTERS_SPHERICAL_VOXEL_GRID_IMPL_H_
Member

New line at the end of the file.

* itself well to reducing the noise and modifying the resolutions of point clouds
* generated by scanners which sample in a spherical manner.
*
* \author Spencer Davis, Radu B. Rusu, Bastian Steder
Member

You can leave it. It was more a curiosity remark than an actual rectification comment.

/** \brief Set to true if we want to return the data outside (\a filter_limit_min_;\a filter_limit_max_). Default: false. */
bool filter_limit_negative_;

/** Cartesian origin of the spherical coordinate system. */
@SergioRAgostinho (Member) Feb 9, 2018

Missing \brief tag.

Member

I don't think it's necessary because we have JAVADOC_AUTOBRIEF=ON.

EXPECT_NEAR(output_cloud->points[3].intensity, 15, 0.001);


/* Test origin movement */
Member

This needs to go to a different test.

EXPECT_NEAR(output_cloud->points[3].intensity, 15, 0.001);


/* Test data filter */
Member

This needs to go to a different test.

EXPECT_NEAR(output_cloud->points[3].intensity, 20, 0.001);


/* Test data filter negative */
Member

This needs to go to a different unit test.

EXPECT_NEAR(output_cloud->points[3].g, 35, 0.001);
EXPECT_NEAR(output_cloud->points[3].b, 45, 0.001);

voxel.setInputCloud(input_cloudrgb);
Member

This needs to go to a different unit test.

EXPECT_NEAR(output_cloud->points[3].b, 45, 0.001);
EXPECT_NEAR(output_cloud->points[3].a, 55, 0.001);

voxel.setInputCloud(input_cloudrgba);
Member

This needs to go to a different unit test.

@SergioRAgostinho added the needs: author reply label and removed the needs: code review label Feb 11, 2018
@SergioRAgostinho removed their assignment Feb 11, 2018
@taketwo (Member) commented Feb 11, 2018

I'm not particularly fond with the whole sorting mechanism to "pseudo cluster" indices. I believe you should use a vector/map of vectors of indices to cluster things. Sorting is a N*log(N) which in this case can be avoided.

Using std::map is also not cheap, e.g. lookup and insertion are log(N) operations. I don't think what you propose is necessarily always strictly faster. One would need a proper algorithm analysis and knowledge of hidden constants or (even better) comprehensive empirical tests to reach solid conclusions.

PCL is a research-grade library. I don't think we can require our contributors to optimize the heck out of the code they submit. If the performance of SphericalVoxelGrid turns out to be critical for someone... well, they can do the job of optimizing! (With this in mind, good unit test coverage is way more important, since it will allow adopting an optimized version without fear of breaking things.)

We specify angular resolution in terms of divisions, but I feel that in a normal case I would want to specify the vertical and horizontal resolution in terms of angles.

The API can be extended in the future. We cannot (and should not) catch all use cases and "what if"s upfront.

I believe we should further exploit the possibility of having a common base class, which serves as a basis for all other VoxelGrid classes?

I agree this would be great. However, there was some reason @sddavis14 did not do this, right?

@sddavis14 (Author)

A couple of thoughts based on everyone's comments:

The original intent of this PR was to develop this functionality separately from the original VoxelGrid while reusing as much code as possible. This meant that this class was essentially a copy of the VoxelGrid with modified index calculations. A lot of the problems we're identifying are problems common to both classes because the code is so similar. As we make improvements here, we are deviating more and more from maintaining consistency with the original VoxelGrid.

My original assumption was that we would not want to modify the original VoxelGrid source. However, if we are open to refactoring things using a VoxelGrid base class, I can contribute the change.

My question then becomes: How much do we want to modify the original VoxelGrid?

We've identified several problems common to both the SphericalVoxelGrid and the standard VoxelGrid:

  1. Extraneous functionality. The filtering is misplaced (user should use PassThrough before VoxelGrid). The original VoxelGrid also contains some search functionality which was not copied here because it did not translate well to the new coordinate system.
  2. The use of a sort for building the voxels. The efficiency of switching to a std::map may be debatable, but I think the code could be simplified significantly (also should be able to avoid the cloud_point_index_idx struct) by switching.
  3. The applyFilter function needs refactoring & less code repetition.

Should we limit our changes to things that are internal to the functionality of the filter, or are we OK with dropping some of the extraneous functionality? Is it worth the performance uncertainty to switch to a std::map for both filters?

@SergioRAgostinho (Member) commented Feb 12, 2018

Using std::map is also not cheap, e.g. lookup and insertion are log(N) operations. I don't think what you propose is necessarily always strictly faster. One would need a proper algorithm analysis and knowledge of hidden constants or (even better) comprehensive empirical tests to reach solid conclusions.

That is true. The vector of vectors still provides some guarantees though: we have inexpensive indexing, linear time for traversal, and there's no insertion per se since it can be initialized in one go. Even if we start with N points, with N somewhere in the order of 10^6 to 10^9, I would assume typical voxelization strategies aim to keep things around 10^4 to 10^5 output points, which should be fairly tractable and better than the current N*log(N) sort.
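
A sketch of the vector-of-vectors clustering being described (again with a hypothetical voxel-index helper; not code from the PR):

// One bucket of point indices per voxel, filled in a single linear pass
std::vector<std::vector<int>> voxel_indices (num_voxels);
for (int i = 0; i < static_cast<int> (input.size ()); ++i)
  voxel_indices[computeVoxelIndex (input[i])].push_back (i);

// Each non-empty bucket can then be reduced (e.g. averaged) independently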

PCL is a research-grade library. I don't think we can require our contributors to optimize the heck out of the code they submit. If performance of SphericalVoxelGrid will turn out to be critical for someone... well, he can do the job of optimizing! (This in mind, a good unit test coverage is way more important, since it will allow to adopt optimized version without fear of breaking things.)

Asking is "free of charge". Whether the user is willing to go through the trouble of investing their own time in doing so is something else. But I still believe that if we know something can be done better in some sense, we should put those ideas on the table. I understand your point, though: having an actual need for a performant implementation will always drive/motivate someone to go through those changes.

My original assumption was that we would not want to modify the original VoxelGrid source. However, if we are open to refactoring things using a VoxelGrid base class, I can contribute the change

I'm very much in favor of this, and I cannot thank you enough for the help you're willing to provide.

My question then becomes: How much do we want to modify the original VoxelGrid?

We've identified several problems common to both the SphericalVoxelGrid and the standard VoxelGrid:

  1. Extraneous functionality. The filtering is misplaced (user should use PassThrough before VoxelGrid). The original VoxelGrid also contains some search functionality which was not copied here because it did not translate well to the new coordinate system.

I would deprecate these functionalities in VoxelGrid and hint that the user should run the PassThrough filter in advance. I would remove all of that code and use the PassThrough filter internally during the deprecation stage.

  2. The use of a sort for building the voxels. The efficiency of switching to a std::map may be debatable, but I think the code could be simplified significantly (also should be able to avoid the cloud_point_index_idx struct) by switching.

That is indeed true but I still believe the vector of vectors to store the "clusters" might provide us some performance benefits.

  3. The applyFilter function needs refactoring & less code repetition.

Some additional insights (not fully sure about some):

  1. At the end of this there will be 4 voxel grid related classes: VoxelGrid, SphericalVoxelGrid, VoxelGridCovariance and VoxelGridOcclusionEstimation.
  2. I believe SphericalVoxelGrid uses a spherical space partitioning, but all the others use a Cartesian partitioning.
  3. VoxelGrid and SphericalVoxelGrid employ averaging as their downsampling/filtering method, VoxelGridCovariance computes normal distribution parameters for each voxel and VoxelGridOcclusionEstimation performs a visibility test.

So at first sight, for all voxel filters there is a stage deciding how to voxelize/partition space and then the actual filtering/processing stage. I believe we should highlight this in the class design.

@SergioRAgostinho added the status: needs decision label and removed the needs: author reply label Feb 12, 2018
@taketwo (Member) commented Feb 12, 2018

Agree with everything.

Considering your insights 2 and 3, what if instead of VoxelGrid, SphericalVoxelGrid, and VoxelGridCovariance, we have a single class VoxelGrid with swappable partitioning and filtering strategies? (Either at compile time with template parameters or at runtime with setters/getters and strategy objects.)
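
A rough sketch of the compile-time flavour of that idea (entirely hypothetical; none of these names exist in PCL):

// PartitionStrategy maps a point to a voxel index (Cartesian, spherical, ...);
// VoxelReducer decides what each voxel emits (centroid, covariance, visibility, ...).
template <typename PointT, typename PartitionStrategy, typename VoxelReducer>
void
voxelGridFilter (const pcl::PointCloud<PointT>& input,
                 const PartitionStrategy& partition,
                 pcl::PointCloud<PointT>& output)
{
  std::vector<VoxelReducer> voxels (partition.size ());
  for (const auto& point : input)
    voxels[partition (point)].add (point);

  output.clear ();
  for (const auto& voxel : voxels)
    voxel.emit (output);
}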

@SergioRAgostinho (Member)

Opened a new "discussion space" in #2211, to avoid going too far off topic in this PR.

In the meantime @sddavis14, just let us know if you feel like embarking on this journey. Your contribution is already valuable but if you feel like lending us an extra hand, your help is definitely welcome.

@sddavis14 (Author)

@SergioRAgostinho Yes, I would like to be involved. This seems like a longer-term project. Are we planning to include this whole change in PCL release 1.9.0? If so, it seems like I should abandon the current implementation of SphericalVoxelGrid and focus on the refactoring efforts.

As another note, I've done some very quick benchmarks on my machine to compare the std::map method to the sorting method. The map method used a map of integer indices to pcl::CentroidPoints. Here are the results:

Sort method: input 35594, output 22874, time ~5900 us
Map method:  input 35594, output 22874, time ~9500 us

Sort method: input 35594, output 2877, time ~5200 us
Map method:  input 35594, output 2877, time ~4300 us

As you can see, the map method is slower when there are a large number of voxels compared to the input point cloud size. As the number of voxels is decreased, the map method eventually becomes faster. The code is also significantly simpler with the map. I haven't had time to try this with a vector of vectors because that implementation is a little more complex.
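
For reference, the map method described above might look roughly like this (a sketch of the idea only; the benchmarked code itself is not part of this PR, and the voxel-index helper is hypothetical):

#include <pcl/common/centroid.h>
#include <map>

std::map<std::size_t, pcl::CentroidPoint<pcl::PointXYZI> > voxels;
for (const auto& point : *input_)
  voxels[computeVoxelIndex (point)].add (point);   // hypothetical point -> voxel mapping

output.clear ();
for (auto& v : voxels)
{
  pcl::PointXYZI centroid;
  v.second.get (centroid);     // averaged point for that voxel
  output.push_back (centroid);
}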

Do you all think the map method is worth investigating further?

@SergioRAgostinho (Member) commented Feb 12, 2018

@SergioRAgostinho Yes, I would like to be involved. This seems like a longer-term project. Are we planning to include this whole change in PCL release 1.9.0? If so, it seems like I should abandon the current implementation of SphericalVoxelGrid and focus on the refactoring efforts.

I would say that it's not a good idea to block 1.9.0 until this is done, so I would move it to the next milestone. We went through a somewhat similar (and yet much lighter) effort with the octree module; that one started back in September and it's only now reaching its final stages.

Do you all think the map method is worth investigating further?

Thanks for that. Just as a quick sanity-check reminder: don't forget to compile with optimizations enabled.

Given those results, I would say that the map of vectors is to be dropped. I believe most people voxelize to stay in the range of 10^4 to 10^6 points, and from your numbers the map is already falling short on the transition from 10^4 to 10^5.

I'm really curious about the vector-of-vectors performance now.

Edit: Also, assuming your computer can handle the stress, it might be worth trying the limits I mentioned in my last comments:

Case 1: 10⁷ input points, 10⁴ output points
Case 2: 10⁹ input points, 10⁴ output points
Case 3: 10⁷ input points, 10⁶ output points
Case 4: 10⁹ input points, 10⁶ output points

@Duffycola commented Jul 25, 2018

@sddavis14:

As you can see, the map method is slower when there are a large number of voxels compared to the input point cloud size.

Would you be able to do the comparison again using std::unordered_map instead of std::map? std::map is a sorted container that internally uses red-black trees, so insertion requires log(N), which you notice in your experiments with larger N. In comparison, search, insertion and removal in std::unordered_map have average constant-time complexity.

Is there anywhere I could have a look at the std::map code?

@Duffycola

Nice to have: an option to use a log-polar mapping instead of an equidistant polar mapping.

@sddavis14 (Author)

@Duffycola unordered_map is definitely a better solution. However, it requires C++11. I'm not sure what the current integration timeline for C++11/14 support is, @SergioRAgostinho.

I've finally had some time to revisit this issue over the past few days and I've actually built a custom version that uses a hash table. Experimentally, this improves performance for most leaf size/point cloud size combinations. I will try compiling the library with C++11 and just using unordered_map to see how that does vs my (presumably naive, but no C++11) hash table implementation.

I can post the applyFilter function source after I've written it to discuss.

In the linked issue discussion (below) @jefft255 mentions that support for smaller leaf sizes is important in some applications. I think solving this particular problem should be another priority of our changes.

As discussed in the issue, the goal is to support any style of mapping. Any help with the class design to accomplish this effectively would definitely be appreciated.

It probably makes sense to continue this discussion in the rewrite issue: #2211

@taketwo (Member) commented Jul 25, 2018

Not sure what the current integration timeline for C++11/14 support

Feel free to use C++11/14 features in your code. Although we don't have fixed dates, I'm positive that by the time your code is ready to merge, the library will require C++14.

@Duffycola

@sddavis14:

unordered_map is definitely a better solution. However, it requires C++11.

I see... I think one could also use a std::vector, assuming it's OK to compute the min/max dimensions of the cloud before allocating a fixed-size vector.

I will pass some comments in the other thread...

@SergioRAgostinho removed this from the pcl-1.9.0 milestone Aug 26, 2018
stale bot commented Feb 22, 2020

This pull request has been automatically marked as stale because it hasn't had
any activity in the past 60 days. Commenting or adding a new commit to the
pull request will revert this.

Come back whenever you have time. We look forward to your contribution.
