`ExposurePipeline` uses `meta.filename` instead of actual filename #818

braingram · 2023-08-09T15:08:42Z

I believe the changes in #802 are not compatible with the test_processing_pipeline_all_saturated regtest.

Here's a recent regtest run with this PR where only the mentioned test was run and stdout was not captured:
https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FRoman-Developers-Pull-Requests/detail/Roman-Developers-Pull-Requests/311/pipeline/205/#step-206-log-97

A few relevant log lines are:

2023-08-08 16:45:59,578 - stpipe.ExposurePipeline - INFO - Starting Roman exposure calibration pipeline ...
2023-08-08 16:45:59,818 - stpipe.ExposurePipeline - INFO - Processing a WFI exposure <roman_datamodels.datamodels._datamodels.ScienceRawModel object at 0x7f0893db6450>
...
2023-08-08 16:51:22,126 - stpipe.ExposurePipeline - INFO - Roman exposure calibration pipeline ending...

2023-08-08 16:51:24,208 - stpipe.ExposurePipeline - INFO - Saved model in r0000101001001001001_01101_0001_WFI01_cal.asdf

Note the filename above does not match the output filename in the regression test:

romancal/romancal/regtest/test_wfi_pipeline.py

Line 438 in 0437760

output = "r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_cal.asdf"

which causes the error

E                   FileNotFoundError: [Errno 2] No such file or directory: '/srv/jenkins/workspace/RT/Roman-Developers-Pull-Requests/clone/test_outputs/popen-gw1/test_processing_pipeline_all_s0/r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_cal.asdf'
/srv/jenkins/workspace/RT/Roman-Developers-Pull-Requests/miniconda/envs/tmp_env0/lib/python3.11/site-packages/asdf/generic_io.py:1150: FileNotFoundError

This seems related to the input filename handling in the exposure pipeline: when the input is provided as a string, input is overwritten with the opened datamodel:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 78 to 85 in 0437760

    
    file_type = filetype.check(input)
    asn = None
    if file_type == "asdf":
        try:
            input = rdm.open(input)
        except TypeError:
            log.debug("Error opening file:")
            return

then added to expos_file:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 94 to 101 in 0437760

    
    # Build a list of observations to process
    expos_file = []
    if file_type == "asdf":
        expos_file = [input]
    elif file_type == "asn":
        for product in asn["products"]:
            for member in product["members"]:
                expos_file.append(member["expname"])

before this check:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 105 to 109 in 0437760

    
    if isinstance(in_file, str):
        input_filename = basename(in_file)
        log.info(f"Input file name: {input_filename}")
    else:
        input_filename = None

which means that meta.filename is not overwritten here:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 117 to 118 in 0437760

    
    if input_filename:
        result.meta.filename = input_filename

and the meta.filename in the input file does not match the filename (r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf):

In [10]: af['roman']['meta']['filename']
Out[10]: 'r0000101001001001001_01101_0001_WFI01_uncal.asdf'

making the output filename incorrect.
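
The control flow traced above can be reproduced in isolation. The following is only a sketch: `open_stub` and `resulting_meta_filename` are hypothetical stand-ins (not romancal or roman_datamodels functions), and the model is faked with `SimpleNamespace` so the snippet runs without any pipeline installed.

```python
from os.path import basename
from types import SimpleNamespace


def open_stub(source, stored_name):
    """Hypothetical stand-in for rdm.open: a string becomes a model carrying the
    meta.filename stored in the file; an existing model passes through unchanged."""
    if isinstance(source, str):
        return SimpleNamespace(meta=SimpleNamespace(filename=stored_name))
    return source


def resulting_meta_filename(pipeline_input, stored_name):
    # Lines 78 to 85: a string input is replaced by the opened model.
    pipeline_input = open_stub(pipeline_input, stored_name)
    # Lines 94 to 101: the model, not the original path, is queued.
    expos_file = [pipeline_input]
    result = None
    for in_file in expos_file:
        # Lines 105 to 109: in_file is a model here, so no name is recorded.
        input_filename = basename(in_file) if isinstance(in_file, str) else None
        result = open_stub(in_file, stored_name)
        # Lines 117 to 118: skipped because input_filename is None.
        if input_filename:
            result.meta.filename = input_filename
    return result.meta.filename


on_disk = "r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf"
stored = "r0000101001001001001_01101_0001_WFI01_uncal.asdf"
print(resulting_meta_filename(on_disk, stored))
```

Under these assumptions the printed name is the stored `meta.filename`, not the on-disk name, which matches the log line quoted above.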


braingram · 2023-08-09T15:09:34Z

@ddavis-stsci should the truth file be updated so the meta.filename is correct or the code fixed to overwrite meta.filename?

ddavis-stsci · 2023-08-09T15:23:09Z

I'll add this to the procedure. It should not affect anything for the reg tests. The saturated file is created by hand, and I'll add that to the procedure. This has been in place since May 30 and has not caused problems with the regression testing; see https://plwishmaster.stsci.edu:8081/job/RT/job/romancal/1008/. In fact, the filename should be ignored with this setting: ignore_asdf_paths = {'ignore': ['meta.[date, filename]


braingram · 2023-08-09T15:35:48Z

Thanks!

So it sounds like this is a new issue introduced by #802 and not an issue with the truth file?

To summarize, prior to #802 meta.filename was overwritten during the exposure pipeline, which allowed test_processing_pipeline_all_saturated to generate a file named r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_cal.asdf, matching the one expected by the regression test. After #802, meta.filename is not overwritten, so the same test produces a file named r0000101001001001001_01101_0001_WFI01_cal.asdf, which leads to the FileNotFoundError when the regression test attempts to open a file that doesn't exist.
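
To make the before/after difference concrete, here is a small sketch. It assumes the output name is obtained by swapping a trailing `_uncal.asdf` for `_cal.asdf`; that is a simplification for illustration and not stpipe's actual naming logic, and `cal_name` is a hypothetical helper.

```python
def cal_name(source_name):
    """Swap a trailing '_uncal.asdf' for '_cal.asdf' (simplified assumption)."""
    suffix = "_uncal.asdf"
    if not source_name.endswith(suffix):
        raise ValueError(f"unexpected input name: {source_name}")
    return source_name[: -len(suffix)] + "_cal.asdf"


# Name of the file on disk and the meta.filename stored inside it.
actual = "r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf"
stored = "r0000101001001001001_01101_0001_WFI01_uncal.asdf"
# Output name hard-coded in the regression test.
expected_by_test = "r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_cal.asdf"

before_802 = cal_name(actual)  # derived from the on-disk filename
after_802 = cal_name(stored)   # derived from meta.filename

print(before_802 == expected_by_test)
print(after_802 == expected_by_test)
```

Only the name derived from the on-disk filename matches what the test looks for; the name derived from meta.filename is the one seen in the "Saved model in" log line.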

ddavis-stsci · 2023-08-09T15:48:23Z

I'll add a ticket for that when the regression tests are updated. However, that change went in on Aug 8 and the tests have been failing since Aug 2. So there is something more going on.


braingram · 2023-08-09T15:58:19Z

The first instance of this test failing with FileNotFoundError I see is run 361:
https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-devdeps/361/
which ran with the romancal commit including the changes from #802: df09aa5

I see prior failures (358, 359, 360) of this test but those were not due to FileNotFoundError and instead because of the rad schema changes: spacetelescope/rad#301

357 passed with 1 failure of a different test.

Is there an instance of this test failing with FileNotFoundError prior to #802?

ddavis-stsci · 2023-08-09T17:58:32Z

I've fixed the input filename and uploaded it to artifactory. I hope this error will go away.

    uncal = rdm.open('r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf')
    >> uncal.meta.filename
    'r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf'
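
A check along these lines could flag hand-made inputs whose stored name has drifted from the on-disk name. This is a sketch only: `filename_mismatch` is a hypothetical helper, and the stored value is passed in as a plain string so the snippet does not depend on roman_datamodels.

```python
from os.path import basename


def filename_mismatch(path, meta_filename):
    """Return (actual, stored) when the on-disk name and meta.filename differ,
    otherwise None."""
    actual = basename(path)
    if actual == meta_filename:
        return None
    return (actual, meta_filename)


fixed = "r0000101001001001001_01101_0001_WFI01_ALL_SATURATED_uncal.asdf"
old_stored = "r0000101001001001001_01101_0001_WFI01_uncal.asdf"

print(filename_mismatch(fixed, fixed))       # names agree
print(filename_mismatch(fixed, old_stored))  # names disagree
```

With a real file, the second argument would presumably come from `rdm.open(path).meta.filename`, as in the session quoted above.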


braingram · 2023-08-09T18:18:17Z

Thanks for the update.
I ran just this regtest on jenkins and it's now showing a different error:
https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-devdeps/364/testReport/romancal.regtest/test_wfi_pipeline/_unstable_deps__test_processing_pipeline_all_saturated/

>       assert model.meta.cal_step.linearity == "SKIPPED"
E       AssertionError: assert 'COMPLETE' == 'SKIPPED'
E         - SKIPPED
E         + COMPLETE

/srv/jenkins/workspace/RT/Roman-devdeps/clone/romancal/regtest/test_wfi_pipeline.py:455: AssertionError

Is preferring 'meta.filename' over the actual filename the intended behavior? If so, does this match what happens when files are read from an association? Looking at the code, it looks like the actual filename would take precedence for those files:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 105 to 118 in 0437760

    
    if isinstance(in_file, str):
        input_filename = basename(in_file)
        log.info(f"Input file name: {input_filename}")
    else:
        input_filename = None
    # Open the file
    input = rdm.open(in_file)
    log.info(f"Processing a WFI exposure {in_file}")
    self.dq_init.suffix = "dq_init"
    result = self.dq_init(input)
    if input_filename:
        result.meta.filename = input_filename
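
The association case can be sketched the same way. The association content below is made up for illustration (the member names and the `data/` directory are hypothetical); only the `products`/`members`/`expname` keys come from the loop quoted earlier in this thread.

```python
from os.path import basename

# Hypothetical association content; only the keys used by the quoted loop appear.
asn = {
    "products": [
        {
            "members": [
                {"expname": "data/foo_uncal.asdf"},
                {"expname": "data/bar_uncal.asdf"},
            ]
        }
    ]
}

# Same loop as in the quoted pipeline code: members are queued as strings.
expos_file = []
for product in asn["products"]:
    for member in product["members"]:
        expos_file.append(member["expname"])

# Every queued entry is still a str, so basename() supplies input_filename.
input_filenames = [basename(f) if isinstance(f, str) else None for f in expos_file]
print(input_filenames)
```

If that reading of the code is right, association members would have meta.filename overwritten with the on-disk name, while a directly supplied file would not.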

ddavis-stsci · 2023-08-09T18:35:09Z

OK, the truth file is updated as well. Note: all of these files will be updated once INS has new input files for build 11. I'll have to look more closely at the filename precedence. Part of the logic is how stpipe handles the filename and generates the output filename, which makes things more complicated.


ddavis-stsci · 2023-08-29T18:19:51Z

Brett, you might want to close this issue. This is fixed in rcal-631.

braingram · 2023-08-29T19:54:55Z

Thanks for pinging me on this.
I just tested main with r0000101001001001001_01101_0001_WFI01_uncal.asdf by running:

strun romancal.pipeline.ExposurePipeline r0000101001001001001_01101_0001_WFI01_uncal.asdf

This produces a file r0000101001001001001_01101_0001_WFI01_cal.asdf.

However, if I copy r0000101001001001001_01101_0001_WFI01_uncal.asdf to foo_uncal.asdf and then run the same command (using the copied file):

strun romancal.pipeline.ExposurePipeline foo_uncal.asdf

it produces a file with what appears to be an incorrect filename, r0000101001001001001_01101_0001_WFI01_cal.asdf, with the relevant log message:

2023-08-29 15:52:01,477 - stpipe.ExposurePipeline - INFO - Saved model in r0000101001001001001_01101_0001_WFI01_cal.asdf

I will update the title of this issue to hopefully a more descriptive title.

ddavis-stsci · 2023-08-29T20:22:37Z

You cannot just change the filename; you need to update meta.filename to match. So you need to copy r0000101001001001001_01101_0001_WFI01_uncal.asdf to foo_uncal.asdf and set meta.filename = 'foo_uncal.asdf'.

This is built into stpipe and is not something romancal controls. It looks like in stpipe you can use the input filename as a template for the output filename,

    def default_output_file(self, input_file=None):
        """Create a default filename based on the input name"""
        output_file = input_file

if meta.filename is not present. I haven't played with this.
We need to use meta.filename here for the case that the input file is an association file,
e.g. r00001-o001_image_001_asn.json
we don't really want the output file to be r00001-o001_image_001_cal.json(??)

braingram · 2023-08-29T20:59:57Z

Thanks for the response.

It appears that the output filename (used by stpipe) is set here in the romancal ExposurePipeline:

romancal/romancal/pipeline/exposure_pipeline.py

Line 181 in 46ce69e

self.output_file = input.meta.filename

This applies to a non-association run where the filename (string) is provided as the input. Prior to #802, the input filename was stored to input_filename here:

romancal/romancal/pipeline/exposure_pipeline.py

Line 69 in 0489537

input_filename = basename(input)

prior to the input being overwritten as the opened model:

romancal/romancal/pipeline/exposure_pipeline.py

Line 74 in 0489537

input = rdd.open(input)

After the model was opened, input_filename was assigned to input.meta.filename, which meant that the above run of foo_uncal.asdf would produce a foo_cal.asdf file.

On the current main branch (after #802) the line that assigned input_filename:

romancal/romancal/pipeline/exposure_pipeline.py

Line 73 in 46ce69e

input_filename = basename(input)

has no effect as input_filename is overwritten as None before it is used:

romancal/romancal/pipeline/exposure_pipeline.py

Line 109 in 46ce69e

input_filename = None

which results in this if conditional failing and input.meta.filename not being assigned:

romancal/romancal/pipeline/exposure_pipeline.py

Lines 117 to 118 in 46ce69e

    
    if input_filename:
        result.meta.filename = input_filename

So to summarize, prior to #802 the actual filename was used, after #802 meta.filename was used. This resulted in the regression test failure that caused this issue to be opened. Looking at the code, I suspect that for files in an association the actual filename will be used and not meta.filename of the file in the association but I do not have an association to test this.
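
One way to restore the pre-#802 behavior would be to record the on-disk name before the input is replaced by the opened model. The sketch below only illustrates that ordering and is not necessarily what the follow-up PR does; `open_stub` and `process` are hypothetical names, and the model is faked so the snippet runs on its own.

```python
from os.path import basename
from types import SimpleNamespace


def open_stub(source, stored_name):
    """Hypothetical stand-in for opening a datamodel from a path or a model."""
    if isinstance(source, str):
        return SimpleNamespace(meta=SimpleNamespace(filename=stored_name))
    return source


def process(source, stored_name):
    # Record the on-disk name first, while `source` is still a string.
    input_filename = basename(source) if isinstance(source, str) else None
    model = open_stub(source, stored_name)
    # A string input overrides meta.filename; a model input keeps its own.
    if input_filename:
        model.meta.filename = input_filename
    return model.meta.filename


stored = "r0000101001001001001_01101_0001_WFI01_uncal.asdf"
print(process("foo_uncal.asdf", stored))
already_open = open_stub("foo_uncal.asdf", stored)
print(process(already_open, stored))
```

With this ordering, a string input yields the on-disk name, while an already opened model keeps whatever meta.filename it carries, since no on-disk name is available in that case.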

I will follow this up with a PR that should hopefully help clarify this issue.

braingram changed the title ~~#802 input_filename changes incompatible with test_processing_pipeline_all_saturated~~ ExposurePipeline uses meta.filename instead of actual filename Aug 29, 2023

braingram mentioned this issue Aug 29, 2023

use filename instead of meta.filename for exposure pipeline #850

Closed

5 tasks

braingram mentioned this issue Jun 3, 2024

group_id in romancal #1259

Closed
