translator for specreduce Trace objects #72

kecnry · 2022-07-20T19:47:46Z

This PR implements a new translator for specreduce Trace objects. In doing so, it adds specreduce as a dependency. In order to round-trip and retrieve the same underyling Trace object (instead of always having to export to an ArrayTrace), this currently saves a copy of the input Trace object in the meta of the glue Data object. If this is not preferred, we could create an ArrayTrace from the glue Data object, but we'll likely still need to attach an instance or reference to the input data.

dhomeier · 2022-07-22T17:12:52Z

Do you think you could set up a test for this with a basic example Data set?

dhomeier · 2022-07-26T13:24:34Z

Thanks for this contribution, this looks all quite good to me, although I am not familiar enough with the Trace format to say if this is the best possible implementation.
It certainly deserves a changelog entry as well, perhaps under 0.5.0 (unreleased) for a new feature.

The main points I see remaining to be settled are

Adding specreduce to requirements; with specutils and spectral-cube already runtime dependencies and specreduce now an Astropy-coordinated package as well, I don't see much problems with that.
As you mentioned, adding the full trace to the meta data might be controversial. As I see it at the moment you are storing a full copy in data.meta['Trace'], while data.meta['Trace'].trace is at the same time kept in data['y'] (this might additionally be checked in the tests BTW). Wondering if it would be preferable to strip the trace itself from the meta copy and add it back in later in to_object to save some space, but I have no idea if it's worth the effort at all for typical sizes of these objects – I guess they are usually (n-1)-D, so not that large? Possibly there is not even a real memory impact if this is a shallow copy.

Leaving those open for discussion with @astrofrog.

glue_astronomy/translators/tests/test_trace.py

eteq

I agree with @dhomeier that the dependency doesn't seem a big deal given the status of specreduce now.

I also agree that the memory impact is not serious given that, from my read of the internals of Trace and its sub-classes, both obj.trace and obj.image are references not copies, and this is just carrying along those references. As long as glue itself does not do any copying later on down the line this is probably "free".

eteq · 2022-07-29T18:32:31Z

glue_astronomy/translators/trace.py

+        obj : `specreduce.tracing.Trace`
+            The Trace object to convert
+        """
+        data = Data(x=obj.image[0], y=obj.trace)


This is implicitly assuming that the 0th dimension is the "spectral"/"x" dimension. I think that's also what specreduce is doing, so that's probably fine right now. But I think it would be better if this is a property of the Trace object that indicates which is the "x" axis.

Maybe I should create an issue in specreduce about this and you can just add a "TODO: change when issue #xxx is addressed" or something?

Yes, there are some places in specreduce with general code to handle arbitrary "dispersion_axis", other parts where it's fixed at 0 but stored, and others where it's just assumed to be 0. I agree that this is a specreduce issue that needs to be addressed consistently across all methods, and then generalize back here when that is supported.

Yes, there are some places in specreduce with general code to handle arbitrary "dispersion_axis", other parts where it's fixed at 0 but stored, and others where it's just assumed to be 0. I agree that this is a specreduce issue

What I can see in the different Trace subclasses is using the same throughout (actually axis 1, which is the dimension that you get from obj.image[0]); KosmosTrace has _disp_axis and _crossdisp_axis properties, but currently fixes them at 1 and 0, respectively. Might make sense to check for the presence of obj._disp_axis and assume 1 otherwise, though.

More importantly however I realised that I had confused axes lengths, values and dimensions here, and am now wondering why you'd want obj.image[0] as component 'x' here at all. This is just a (somewhat arbitrary?) row of count values in the detector image, so really bears no particular information on the dispersion or spectral axis, right?
You could construct a pixel count or similar from that, but Data already gives you that automatically in data[data.pixel_component_ids[0]].
Of course if that pixel row is required in your application it can be included by the importer, but then the documentation should add an explanation why this component is added. My impression now is that the most natural structure would be to have only obj.trace as a component (and perhaps also name it 'trace' rather than 'y', which seems more self-explanatory).

That's a good point, I removed the x-component entirely and that still seems to do what is needed (without needing to worry or document anything about the dispersion axes). Please check to see if this is what you had in mind or if you can think of any scenario this wouldn't work.

Thanks; I think this is the most intuitive structure. About potential use scenarios I cannot say much, but in any case if you needed any of the actual image data, you'd still have access through data.meta['Trace'].
Could you rebase; this will hopefully fix the previous test failures?

rebased, but I don't have permissions to rerun the CI workflows. Thanks!

eteq · 2022-07-29T18:37:08Z

glue_astronomy/translators/trace.py

+        if not isinstance(data.meta.get('Trace'), Trace):
+            raise TypeError("data is not of a valid specreduce Trace object")
+
+        return data.meta['Trace']


This worries me, because it is completely ignoring what's actually in the data object itself. I think that's counter-intuitive because a likely eventual workflow is to change the y trace by hand, which is probably most naturally done in glue.

So I would suggest the following:

If there's some way that glue data objects can flag "have been modified since created", only do the next step if that one is true.

Do something like:

Suggested change

return data.meta['Trace']

data.meta['Trace'].trace = data.y

return data.meta['Trace']

Now this does introduce an assymetry between x and y. But for a trace I think that makes sense.

Regardless, I think either in to_object or the top-level class, the docstring should be explcit about whether it gets data from glue to create the object.

I like this idea. We can even include a check to make sure that x has not changed and that y is the correct length. In cases where y has changed, we'll probably want to return an ArrayTrace with the modified trace.y. For example: if the input trace object is a FlatTrace, but then the underlying data is modified once in glue, it no longer (necessarily) makes sense to return another FlatTrace object, right?

In theory, I don't see why Trace needs the input image, it's just convenient to have attached, so if we can make that more flexible on the specreduce end, then we might be able to rebuild the object from scratch instead of relying on these references. But I think in cases where the data isn't modified, it is nice to roundtrip and return the same type of trace, which at least involves storing some more information than the arrays themselves.

Update: I now check to see if the "y"-component matches the trace, and if not, create a new ArrayTrace object with the same image. It turns out that glue does not allow changing the shape of data components, so there is no need for any checks on the lengths, but I do raise an error if the "x"-component no longer matches the dispersion axis of the image.

I also agree that the memory impact is not serious given that, from my read of the internals of Trace and its sub-classes, both obj.trace and obj.image are references not copies, and this is just carrying along those references. As long as glue itself does not do any copying later on down the line this is probably "free".

Any data component initialising or being added to a data object is processed by Component.autotyped(), which will typically involve np.asarray(data) and potentially other re-casting operations. So for most ndarray inputs there will probably be no actual copying, but even if, this only involves obj.trace. The meta reference is just an OrderedDict item.

If there's some way that glue data objects can flag "have been modified since created", only do the next step if that one is true.

Wondering if the callback mechanism might be used for this, although I do not see an example where this is applied in a Data component so far, apart from the use in _update_world_components.

eteq · 2022-07-29T18:38:06Z

(Oh, and my "If there's some way that glue data objects can flag "have been modified since created", only do the next step if that one is true." comment could use some insight from @astrofrog if @dhomeier doesn't know the answer!)

additional assert statement in test Co-authored-by: Derek Homeier <dhomeie@gwdg.de>

* and raise error if "x" component is changed

* can rely on the automatic pixel component and recover everything from the original trace object

codecov · 2022-08-12T16:45:44Z

Codecov Report

Merging #72 (aec7ee3) into main (bae4ecf) will decrease coverage by 0.16%.
The diff coverage is 92.68%.

❗ Current head aec7ee3 differs from pull request most recent head 7ce7142. Consider uploading reports for the commit 7ce7142 to get more accurate results

@@            Coverage Diff             @@
##             main      #72      +/-   ##
==========================================
- Coverage   97.63%   97.46%   -0.17%     
==========================================
  Files          15       17       +2     
  Lines        1182     1223      +41     
==========================================
+ Hits         1154     1192      +38     
- Misses         28       31       +3

Impacted Files	Coverage Δ
glue_astronomy/translators/trace.py	`85.71% <85.71%> (ø)`
glue_astronomy/translators/__init__.py	`100.00% <100.00%> (ø)`
glue_astronomy/translators/tests/test_trace.py	`100.00% <100.00%> (ø)`

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

dhomeier

Apart from the minor style comments this should be good to go, thanks again!

@eteq I think your change requests are adequately addressed with this (the question which axis is dispersion in fact no longer arises)?
@astrofrog the only potentially contentious issue I see remaining is the specreduce dependency, but we had a clear majority in favour of that.

glue_astronomy/translators/trace.py

dhomeier · 2022-08-12T17:59:11Z

Oh, one last thing: please add a changelog entry, and change the current section heading to 0.5.0 (unreleased)! :)

dhomeier · 2022-08-12T19:56:11Z

Successful test run, so considering the last commit a [ci skip].

eteq · 2022-08-17T18:41:06Z

sorry I didn't get back to this sooner, by a a post-hoc "yes this is good, thanks!"

kecnry mentioned this pull request Jul 20, 2022

Plugin: spectral extraction spacetelescope/jdaviz#1514

Merged

26 tasks

dhomeier added the enhancement New feature or request label Jul 22, 2022

dhomeier reviewed Jul 26, 2022

View reviewed changes

glue_astronomy/translators/tests/test_trace.py Show resolved Hide resolved

eteq requested changes Jul 29, 2022

View reviewed changes

kecnry mentioned this pull request Aug 9, 2022

spectral-extraction: support import/export trace via glue-astronomy spacetelescope/jdaviz#1556

Merged

11 tasks

dhomeier mentioned this pull request Aug 11, 2022

Export roi.theta in 'astropy-regions' translator and run passing tests #73

Merged

kecnry and others added 6 commits August 12, 2022 12:39

translator for specreduce Trace objects

10ad8cf

basic test coverage and docs listing

1e06c7d

Apply suggestions from code review

31a0ed4

additional assert statement in test Co-authored-by: Derek Homeier <dhomeie@gwdg.de>

export to ArrayTrace with new trace if "y" component changed

32620c3

* and raise error if "x" component is changed

rename "y" component to "trace"

c3ae01a

remove x-component

46c65e5

* can rely on the automatic pixel component and recover everything from the original trace object

kecnry force-pushed the translator-specreduce-trace branch from 6680612 to 46c65e5 Compare August 12, 2022 16:39

dhomeier requested changes Aug 12, 2022

View reviewed changes

glue_astronomy/translators/trace.py Outdated Show resolved Hide resolved

glue_astronomy/translators/trace.py Outdated Show resolved Hide resolved

improve error messages

aec7ee3

dhomeier approved these changes Aug 12, 2022

View reviewed changes

changelog entry

7ce7142

dhomeier merged commit d245375 into glue-viz:main Aug 17, 2022

kecnry deleted the translator-specreduce-trace branch August 17, 2022 18:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

translator for specreduce Trace objects #72

translator for specreduce Trace objects #72

kecnry commented Jul 20, 2022

dhomeier commented Jul 22, 2022

dhomeier commented Jul 26, 2022 •

edited

Loading

eteq left a comment

eteq Jul 29, 2022

kecnry Jul 29, 2022

dhomeier Aug 5, 2022 •

edited

Loading

kecnry Aug 11, 2022

dhomeier Aug 12, 2022

kecnry Aug 12, 2022

eteq Jul 29, 2022

kecnry Jul 29, 2022

kecnry Jul 29, 2022

dhomeier Aug 1, 2022

eteq commented Jul 29, 2022

codecov bot commented Aug 12, 2022 •

edited

Loading

dhomeier left a comment •

edited

Loading

dhomeier commented Aug 12, 2022

dhomeier commented Aug 12, 2022

eteq commented Aug 17, 2022

	return data.meta['Trace']
	data.meta['Trace'].trace = data.y
	return data.meta['Trace']

translator for specreduce Trace objects #72

translator for specreduce Trace objects #72

Conversation

kecnry commented Jul 20, 2022

dhomeier commented Jul 22, 2022

dhomeier commented Jul 26, 2022 • edited Loading

eteq left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhomeier Aug 5, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eteq commented Jul 29, 2022

codecov bot commented Aug 12, 2022 • edited Loading

Codecov Report

dhomeier left a comment • edited Loading

Choose a reason for hiding this comment

dhomeier commented Aug 12, 2022

dhomeier commented Aug 12, 2022

eteq commented Aug 17, 2022

dhomeier commented Jul 26, 2022 •

edited

Loading

dhomeier Aug 5, 2022 •

edited

Loading

codecov bot commented Aug 12, 2022 •

edited

Loading

dhomeier left a comment •

edited

Loading