DIPjavaio to handle files with multiple images #156
Comments
I assume calling Can you expand on your preference for returning a vector? Do you expect to read images with different dimensionalities/properties in other formats, but not in TIFF?
TIFF can also have images of different sizes. But I've never seen a TIFF file with 50k images in it. Usually they're related, they're either slices of a 3D image, or they're scaled versions of the same image (a pyramid), etc. We can handle these cases well with the current TIFF reading code in DIPlib. This CIF file has 50k images of a single cell each. Each image has 6 channels and is tiny (~80 pixels square), but they're all cut to a different size depending on the size of the cell. BTW, I didn't post this issue to have you do the work, I just wanted to record my thoughts and hopefully get some good suggestions.
@wcaarls Take a look at what I did so far: 28d2fd34eaa509949eff744d1a3a45633a144a1b
If the images are different sizes, you would indeed need to return a vector somehow. The code looks good! How is the overhead of reading 50k images like that? If the overhead is too high, another option (although perhaps not the best one) is to introduce state to the interface, where you first open the image, then read however many images you want, and then close it.
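To make the "open, read however many you want, close" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `MultiImageReader`, `NumberOfImages()`, `Read()` and the stand-in `Image` struct are illustration names, not DIPlib or DIPjavaio API, and the file contents are faked.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for dip::Image; only the sizes matter for this sketch.
struct Image {
   std::size_t width;
   std::size_t height;
};

// Hypothetical stateful reader: the constructor opens the file once, Read()
// can then be called any number of times, and the destructor closes it.
class MultiImageReader {
public:
   explicit MultiImageReader( std::string filename ) : filename_( std::move( filename )) {
      // Assumption: a real implementation would do the expensive open here.
      // We fake a file that holds three images of different sizes.
      images_ = { { 80, 76 }, { 64, 82 }, { 91, 70 } };
      ++openCount;
   }
   std::string const& Filename() const { return filename_; }
   std::size_t NumberOfImages() const { return images_.size(); }
   Image Read( std::size_t index ) const {
      if( index >= images_.size() ) {
         throw std::out_of_range( "image index out of range" );
      }
      return images_[ index ];
   }
   static int openCount; // counts how often the expensive open step ran
private:
   std::string filename_;
   std::vector< Image > images_;
};
int MultiImageReader::openCount = 0;

// Usage: one open, many reads. Differently sized images end up in a vector.
std::vector< Image > ReadAll( std::string const& filename ) {
   MultiImageReader reader( filename );
   std::vector< Image > out;
   for( std::size_t ii = 0; ii < reader.NumberOfImages(); ++ii ) {
      out.push_back( reader.Read( ii ));
   }
   assert( out.size() == reader.NumberOfImages() );
   return out;
}
```

With this shape the cost of opening the file is paid once per reader object rather than once per image, at the price of exposing the reader's lifetime to the caller.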
This is indeed quite slow. Opening this file takes a second or so every time. I think reading in a series of images as a vector, specified through a I'm thinking there are two options:
The issue with option 2 is that we'll be limited by the Java memory. In option 1, Java only reads one image at a time, so it won't be overwhelmed. I have no idea which option is easier to implement... And I don't know if option 1 is most of the way towards the stateful reader?
I prefer option 1, which indeed is most of the way there to a stateful reader. The difference (and perhaps advantage) is that the user does not see the statefulness. To implement it, the
Another advantage would be that we can easily implement a
In Bio-Formats, `reader.getSeriesCount()` will return the number of images in the file, and `reader.setSeries(i)` will configure the reader to start reading the image number `i`.

We could add a parameter to `dip::ImageReadJavaIO()` to indicate which image to read, and with a default value of 0. It being at the end is ugly but won't break code that currently uses this function.
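As a rough illustration of why a trailing parameter with a default keeps existing callers working, consider the sketch below. The function `ReadImageSketch`, its parameter names, the `"bioformats"` placeholder default and the stand-in `Image` struct are all assumptions made for this example; this is not the actual `dip::ImageReadJavaIO()` declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for dip::Image; it only records which image in the file was read.
struct Image {
   std::size_t imageNumber;
};

// Hypothetical signature: the parameters that already existed come first,
// the new image number goes last and defaults to 0 (the first image).
Image ReadImageSketch(
      std::string const& filename,
      std::string const& interfaceName = "bioformats", // placeholder default
      std::size_t imageNumber = 0                       // new trailing parameter
) {
   assert( !filename.empty() );
   (void)interfaceName; // unused in this sketch
   // Assumption: a real implementation would select the image in the
   // underlying reader here and then read the pixel data.
   return Image{ imageNumber };
}
```

Calls written before the change, such as `ReadImageSketch( "cells.cif" )` or `ReadImageSketch( "cells.cif", "bioformats" )`, still compile and read image 0. The ugly part is that a caller who wants image 41 has to spell out the interface argument as well.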
`dip::ImageReadTIFF()` actually has a `dip::Range` parameter for the image number, and will concatenate all the images read. I'm not sure this is useful in the generic case of `ImageReadJavaIO`, which can deal with so many different file types. I'd rather write a new function that populates a `std::vector` of images. Right now I'm dealing with a CIF file that contains 50k tiny images, several GB all together, and I don't think it's a good idea to try to read that in one go. But on the other hand, some of these multi-image file formats don't have an index that points to each image, and the reader has to pass through all images before the one you want to read (see for example TIFF). So calling a reader function for each image is terrible. For these cases, you really want to initialize the reader, and return an iterator over images. But then the API starts to become quite complex...

Oh, OK, `setSeries()` should be fast. If so, opening the file could be slow? It's probably still more efficient to read many images in one go than calling `dip::ImageReadJavaIO()` for each image in a large file.
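The cost argument for formats without an index can be sketched with a toy model. `SequentialFile`, `ReadOne` and `ReadRange` are hypothetical names for this illustration only, and the "step" counter stands in for whatever work the reader does to get past one image.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for dip::Image; it only records its position in the file.
struct Image {
   std::size_t index;
};

// Hypothetical model of a file format without an index: to reach image k,
// the reader has to step over images 0..k-1 first.
class SequentialFile {
public:
   explicit SequentialFile( std::size_t nImages ) : nImages_( nImages ) {}
   // Restart from the beginning of the file.
   void Rewind() { position_ = 0; }
   // Read the image at the current position and advance; counts one step.
   Image Next() {
      assert( position_ < nImages_ );
      ++steps;
      return Image{ position_++ };
   }
   std::size_t steps = 0; // total number of images passed over so far
private:
   std::size_t nImages_;
   std::size_t position_ = 0;
};

// One call per image: every call starts over and walks to the requested image.
Image ReadOne( SequentialFile& file, std::size_t index ) {
   file.Rewind();
   Image img{ 0 };
   for( std::size_t ii = 0; ii <= index; ++ii ) {
      img = file.Next();
   }
   return img;
}

// One call for a range: a single pass that keeps images first..last (inclusive).
std::vector< Image > ReadRange( SequentialFile& file, std::size_t first, std::size_t last ) {
   file.Rewind();
   std::vector< Image > out;
   for( std::size_t ii = 0; ii <= last; ++ii ) {
      Image img = file.Next();
      if( ii >= first ) {
         out.push_back( img );
      }
   }
   return out;
}
```

In this model, fetching all 100 images of a 100-image file through `ReadOne` costs 1 + 2 + ... + 100 = 5050 steps, while a single `ReadRange( file, 0, 99 )` costs 100. With 50k images the per-image approach grows quadratically, which is the case where a range read or an iterator pays off.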