Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arrow deser: Add zero-copy happy path #3109

Closed
emilk opened this issue Aug 25, 2023 · 7 comments
Closed

Arrow deser: Add zero-copy happy path #3109

emilk opened this issue Aug 25, 2023 · 7 comments
Labels
codegen/idl enhancement New feature or request 🚀 performance Optimization, memory use, etc

Comments

@emilk
Copy link
Member

emilk commented Aug 25, 2023

Currently process_arch_view in crates/re_space_view_spatial/src/parts/points3d.rs will bottom-out in Loggable::try_iter_from_arrow which returns an iterator. We should be able to return slices to the underlying arrow buffer instead (zero-copy) for simple things like positions, radii, colors, etc.

Complex things (e.g. struct) we cannot do this, because Arrow is SoA and we want AoS.

We should consider replacing try_iter_from_arrow with:

  fn try_collection_from_arrow(
      data: &dyn ::arrow2::array::Array,
  ) -> DeserializationResult<Collection<Self>>;enum Collection<T>
{
    Sliceable(DataCell),
    Collected(Vec<T>),
}

impl<T> Collection<T>
{
    pub fn as_slice(&self) -> &[T] {}
}

For the happy-path, the codegen will generate return a slice. For other cases it would iterate and collect, and return the Vec.

This makes the calling code very simple, and the happy path very fast.
The unhappy path is not slower - we need to do a collect at some point anyway.

@emilk
Copy link
Member Author

emilk commented Aug 27, 2023

Similarly, try_from_arrow_opt could return something like:

enum CollectionOpt<T> {
    /// Missing => all `None`
    Empty { num_instances: usize }

    /// No nulls
    Dense(DataCell),
    
    Collected(Vec<Option<T>>),
}

this would greatly optimize the happy-path of e.g. color deserialization (happy = no nulls),
but would also allow optimizing for when a component is missing (e.g. no radii => auto for everything)

@jleibs
Copy link
Member

jleibs commented Aug 28, 2023

We should definitely do this, but the bulk of the slowdown from colors is still from the usage of the joining iterator, rather than identifying that points and colors both have the same instance-ids and going direct to the native iterator.

@emilk
Copy link
Member Author

emilk commented Sep 6, 2023

We should definitely do this, but the bulk of the slowdown from colors is still from the usage of the joining iterator, rather than identifying that points and colors both have the same instance-ids and going direct to the native iterator.

Right you are!

Here's open_photogrammetry_format in release mode:

image

Zero-copy deser would shave off ~4ms of try_from_arrow-opt "rerun.colorrgba", while the joining iterator optimization would potentially shave off a full 9ms (the "colors" collection).

Here's the related issue:

@Wumpf Wumpf added the 🚀 performance Optimization, memory use, etc label Sep 11, 2023
@emilk
Copy link
Member Author

emilk commented Oct 10, 2023

Here's the latest flamegraph of what happens in wasm (serial):

Image

Positions are collected twice before being sent to the gpu, but we don't need to collect at all in this case, so that's around 1.7ms we can save.

For radii we construct an Iterator<Item = Option<f32>> where everything is None, then collect that into a Vec<Size>.
If we knew ahead of time that everything would be None we could early-out and just construct a vec![Size::AUTO; count] should should save maybe 3ms.

For colors we currently collect a Vec<Option<T>> where everything is Some, then collect that into a Vec<T>.
I think we could make this a lot faster, but hard to tell how much.

The instance keys are collected twice. We can save 0.5 ms there.

Conservatively counting that is at least another 50% speedup for wasm with this simple optimization.

@teh-cmc
Copy link
Member

teh-cmc commented Oct 11, 2023

At this point I would hold off on this until we're done migrating to arrow-rs I think.

@emilk emilk removed this from the 0.10 Polish (non-blocking) milestone Oct 18, 2023
@emilk
Copy link
Member Author

emilk commented Aug 5, 2024

@teh-cmc is this basically done now?

@teh-cmc
Copy link
Member

teh-cmc commented Aug 5, 2024

Hmm yeah, I guess so.

There's still a lot of cleanup I want to do regarding the code generated for deserialization (for the few cases where we still rely on it), but that's somewhat orthogonal to this.

@teh-cmc teh-cmc closed this as completed Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
codegen/idl enhancement New feature or request 🚀 performance Optimization, memory use, etc
Projects
None yet
Development

No branches or pull requests

4 participants