Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orthogonal selection #108

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

Orthogonal selection #108

wants to merge 25 commits into from

Conversation

Artur-man
Copy link

@Artur-man Artur-man commented Sep 3, 2024

Summary:

  • OrthogonalIndexer and IntArrayDimIndexer
  • Order class from zarr-python
  • IntDimIndexer (may need some work)
  • Future work: BoolArrayDimIndexer
  • update normalization for integer vectors
  • add methods for int and zb_int (are these needed?), currently implemented but not used ...
  • zero_based_to_one_based function now also applies to none slice selections
  • [.ZarrArray triggers orthogonal selection only (or should it?)
  • add tests for orthogonal selection

Relevant Issues: #107 #104 #97 #43

@Artur-man
Copy link
Author

Artur-man commented Sep 3, 2024

Here it is, tested a bunch of times but who knows :D Let me know what you guys think and lets review if you want.

@Artur-man
Copy link
Author

Artur-man commented Sep 7, 2024

Commented out exported functions under int.R to let pkgdown do its thing ... although I am not sure if int() or zb_int() are indeed needed.

Copy link
Owner

@keller-mark keller-mark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Artur-man thank you very much for the PR. I had some comments about additional tests to add for the new internal classes. Since they contain indexing arithmetic it would be great to have unit tests to at least ensure there are no regressions.

@@ -0,0 +1,46 @@
manage_filters <- function(filters) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to add documentation comments for this function?

@@ -1,3 +1,28 @@
is_pure_fancy_indexing <- function(selection, ndim = length(selection)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to add a test for this function and documentation comments to convey the purpose?

R/indexing.R Show resolved Hide resolved
R/indexing.R Outdated Show resolved Hide resolved
#' @param array ZarrArray object that will be indexed
#' @rdname OrthogonalIndexer
#' @keywords internal
OrthogonalIndexer <- R6::R6Class("OrthogonalIndexer",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to add a test for this OrthogonalIndexer class? I know python may not have any analogous tests but maybe the current behavior can be tested like for BasicIndexer https://github.com/keller-mark/pizzarr/blob/main/tests/testthat/test-indexing-basic.R#L26 to prevent regressions

#' TODO
#' @rdname Order
#' @keywords internal
Order <- R6::R6Class("Order",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, would you be able to add a test for this Order class?

#' TODO
#' @rdname IntArrayDimIndexer
#' @keywords internal
IntArrayDimIndexer <- R6::R6Class("IntArrayDimIndexer",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, would you be able to add a test for this IntArrayDimIndexer class?

#' #' with zero-based indexing
#' #' @param index integer index
#' #' @export
#' zb_int <- function(index) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could make sense to have a zb_int function so that users who want to use zero-based / python-style indexing can do so without much thinking. It could just do index + 1 i think

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do!

R/zarr-array.R Outdated Show resolved Hide resolved
expect_error(z[1], "This Zarr object has 2 dimensions, 1 were supplied")

# test `[.ZarrArray` against get_item
expect_equal(z[1:2,5], z$get_item(list(slice(1, 3), slice(5, 5))))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this line was previously

expect_equal(z[1:3,5], z$get_item(list(slice(1, 3), slice(5, 5))))

Did the behavior change?

Copy link
Author

@Artur-man Artur-man Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I have been meaning to discuss this, I think 1:2 and 1:3 have the same result here, but I guess I was experimenting with how to treat :.

Note that the shape of a is (2,10) and I think slice(1,3) would call python-based indices (0,1,2) which are the first three rows but I think slice doesn't care if indices are outside of dimension limits ?

> z$get_item(list(slice(1, 10), slice(5, 5)))$data
     [,1]
[1,]    9
[2,]   10
> z$get_item(list(slice(1, 20), slice(5, 5)))$data
     [,1]
[1,]    9
[2,]   10

As a remark all dimensional indices with a:b would be converted to slice(a,b), and this is still the case with manage_filters. See below:

pizzarr/R/zarr-array.R

Lines 1231 to 1235 in f84355d

} else if(typeof(x) == "language") {
x <- as.list(x)
# Return a range (supplied via : or seq())
if(x[[1]] == ":") {
return(slice(x[[2]], x[[3]]))

A typical R-based indexing with a:b and [ method would fail in this situation. Lets see what happens when we incorporate the same indexing without/with ZarrArray.

# with ZarrArray
> z <- zarr_create(shape=dim(a), dtype="<f4", fill_value=NA)
> z$set_item("...", a)
> z[1:3,5]$data
     [,1]
[1,]    9
[2,]   10

# without ZarrArray
> a <- array(data=1:20, dim=c(2, 10))
> a[1:3,5]
Error in a[1:3, 5] : subscript out of bounds

It is up to you @keller-mark, how would you like [ to behave.

@Artur-man Artur-man mentioned this pull request Sep 10, 2024
Artur-man and others added 3 commits September 10, 2024 22:05
Co-authored-by: Mark Keller <7525285+keller-mark@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants