Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose gifti data API #789

Closed
htwangtw opened this issue Aug 8, 2019 · 3 comments
Closed

Propose gifti data API #789

htwangtw opened this issue Aug 8, 2019 · 3 comments
Labels
Milestone

Comments

@htwangtw
Copy link
Contributor

htwangtw commented Aug 8, 2019

I am working with HCP Conte69 surface data in resolution that's not supported in the official HCP pipeline. I mostly work with surf.gii, shape.gii, func.gii and cifti format. I will focus on gifti in this issue as cifti is a different file format. I found the nibabel gifti function a bit difficult to use for read/write the actual data. I would like to propose a new API aiming at more intuitive access to the data and save new output.

The most important information for me as a user is

  1. the intents of the darrys
  2. size of the collection of darray (currently supported through GiftiImage.numDA)
  3. resolution (ie. mesh face and coordinate numbers)
  4. access data as numpy arrays rather than going through all GiftiImage.darrays to concatenate the information (similar to the output from nilearn.surface.load_surface_mesh)

Header-like method to access information of in point 1 to 3

The current GiftiImage.print_summary() method will give you all the metadata, but the only useful information is the metadata in the intents. Currently, you have to access the darrays first to get more useful information.
GiftiImage.darrays[0].print_summary()
Some
I would like to have a header that can access the information about the intents.
Something like this:

GiftiImage.header 
GiftiImage.header.intents - what are the intents of the data array
GiftiImage.header.meta - for the metadata

Access data

For .func.gii and other metric files, it would be nice to concatenate all darrys for timeseires data through get_data().
For any kind of mesh (surf.gii), the method will return a tuple of corrdinates and faces.
The preliminary idea is as follow:

get_data(intent_code=None):
    timeseries = vstack(da.data for da in get_array_by_intent)
    coords = get_arrays(pointset).data
    faces = get_arrays(triangles).data
    return (coords, faces)
    if None:
        tuple(da.data for da in self.darrays)

Creating gifti files

Currently I replace the data in exisitng gifti files to create new image. It would be helpful if I can just pass a dictionary with the intents as keys and the data as the entry to create a new file.
For surface mesh (surf.gii), the user can pass a tuple of (coords, faces) and the funciton will translate that into data arrays. For timeseries data or other metric data (func.gii, shape.gii, and so on), we can have a np array of the size time x faces and parse each timepoint as a darray to fit the gifti format.

@effigies
Copy link
Member

effigies commented Aug 9, 2019

Hi @htwangtw, thanks for starting this conversation. I definitely agree that GIFTI could stand a friendlier API.

I like get_data(self, intent=None). GiftiImage does not currently have a get_data, but this is a method on the DataobjImage class, so we might want a slightly different name to avoid confusion. Just as a name off the top of my head, aggregate_data (or agg_data()?) might make clear that we're doing some work and not simply extracting/munging a DataobjImage data block.

I'll sketch out a bit more of what I'm thinking for this method (let me know what you think):

class GiftiImage:
    def agg_data(self, intent_code=None):
        # Allow multiple intents to specify the order
        # e.g., agg_data(('pointset', 'triangle')) ensures consistent order
        if isinstance(intent_code, tuple):
            return tuple(self.agg_data(intent_code=code) for code in intent_code)

        # If only one type of data array, use that intent code.
        # The idea here would be to allow agg_data() == agg_data('time_series') for time series files

        darrays = self.darrays if intent_code is None else self.get_arrays_from_intent(intent_code)
        all_data = tuple(da.data for da in darrays)
        
        if intent_codes.niistring[intent_code] == "NIFTI_INTENT_TIME_SERIES":
            return np.column_stack(all_data)

        # Possibly other sensible things to do for other intents

        return all_data

And to think of some constructors:

class GiftiImage:
    @classmethod
    def new_timeseries(cls, data):
        ...

    @classmethod
    def new_surface(cls, coordinates, triangle):
        ...

These seem like the common file types I see in the wild, so even without a complete set of constructors for the officially listed file types, I think these would be a great improvement.

I haven't thought through a GiftiHeader, yet. The big issue I think is going to be making sure that changes to it sync back to the image in an unsurprising way.

I also wonder if it would make sense to add properties for querying whether an image looks like a canonical type, for example:

@property
def is_surface(self):
    return sorted(da.intent for da in darrays) == [intent_codes['pointset'], intent_codes['triangle']]

@htwangtw
Copy link
Contributor Author

Hey @effigies

It makes sense to have a new name. I like agg_data and the suggested syntax. From my understanding, surface file is the only kind that has different intents for each DataArray among canonical file types.

It will help a lot to a property check for surface image. From my understanding, the workbench only reads [pointset, triangle] as the valid order. We can have a property check for both the constructor and the agg_data function. This is the only file type we need to worry about. Property of other file types might not be essential. What do you think?

I am not certain if I should call it GIFTIHeader or something else. I was mostly thinking about it as a NIFTI1Header. Header is not an offical property of GIFTI so I don't want people to be confused with the name. The problem with the GIFTI top-level MetaData is that it doesn't give you easy access for the summary of each DataArray. The information in the metadata of each DataArray holds much more useful information in my opinion. The file extension (surf.gii, func.gii, label.gii) is the only way to easily know what intent label the content might be, but not a reliable.

Another thought I had after thinking about the header class more is that not knowing the intent code is more a problem in file writing than reading. Perhaps we can focus on the agg_data and the constructor first, and then evaluate how to organise the metadata in an alternative way.

@effigies
Copy link
Member

From my understanding, surface file is the only kind that has different intents for each DataArray among canonical file types.

I think this is true from a practical perspective, but we should be prepared to handle GIFTI files with arbitrarily many ill-advised DataArray combinations.

From my understanding, the workbench only reads [pointset, triangle] as the valid order.

True, but nibabel will generally allow you to shoot yourself in the foot. I think we should make it easy to do the compatible thing, but if we go out of our way to make it hard to do silly things, there's a greater risk of disabling weird but legitimate uses.

We can have a property check for both the constructor and the agg_data function.

I'd say let's have the constructor create the DataArrays in the workbench-canonical order, which is almost certainly the FreeSurfer-canonical order as well. As an aside, it will be interesting to create things in the "wrong" order and see if FreeView and FSLeyes or other viewers interpret them correctly.

For agg_data, I think representing the data as its found is probably the least surprising move. How were you thinking to use the property check here?

Perhaps we can focus on the agg_data and the constructor first, and then evaluate how to organise the metadata in an alternative way.

👍 Sounds good. Would you care to start a PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants