Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
DATAPTR()
usage in character implementationExtract_subset
implementation? Or is the fallback good enough? (New approach is much faster even withoutExtract_subset
due to fasterElt
access with cached pointers, we really want base R to handle all the intricacies of this.)#define
s to section off code that doesn't work in older R versions, possibly bump minimal rlang R versionr_init_library_with_dll()
helper with LionelThen
r_vec_resize()
tor_vec_grow()
and remove support for shrinkingr_dyn_unwrap()
in this PRAdds support for ALTREP contiguous views, i.e.
vec_view(x, start, size)
, for the major R typesA few notes:
x
gets marked as not mutable on the way inx
right nowr_init_library_with_dll()
helper that goes in yourR_init_<pkg>()
function (good to know if you are utilize the rlang C API). The ALTREP classes are registered with thepackage
name, so if vctrs uses the rlang C API then it will register its own ALTREP view classes under the"vctrs"
package name. I think this is a good thing.Here are important notes on the internal structure:
Benchmarking - The most notable thing in the benchmarks is that slicing with a large slice can be up to 2x slower than native code due to
ExtractSubset()
in base R using the*_Elt()
ALTREP method to extract out each element. I think it could be much more efficient by usingDataptr_or_null()
to first see if it could get a cheap readonly dataptr to the altrep data (it can here, it's a view) and then using that directly. I plan to submit that patch or talk to Luke about it, then this difference would poof go away entirely.Also note that the difference is mostly apparent only when a contiguous slice is made with
ExtractSubset()
, where native objects probably have 0 cache misses, but we have to go through an ALTREP method each time. When you extract a random slice of the same size, the timings actually tend to converge and cache misses become more important.Otherwise I think it's pretty darn good!