Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bf unconforming where in coupling routines #17

Merged
merged 3 commits into from
Oct 22, 2020

Conversation

ukmo-juan-castillo
Copy link

Bug fix in where sentences present in coupling routines, which generate out of bounds runtime errors. These changes have been tested offline and the runtime errors disappear. The coupling regtests are different from the reference, which indicates that this bug generates errors in the results of coupled configurations. This is supported by the fact that the OASICM regtest, which is not influenced by this bug, passes without problems.

I do not copy the results of the regtests because they are quite large. It would be convenient to merge these changes to the staging_oc20 branch before implementing further development.

@ukmo-juan-castillo
Copy link
Author

Please have a look to issue 259 for more details:

NOAA-EMC#259

@mickaelaccensi
Copy link
Collaborator

thanks Juan for the bugfix,
it's even weird that it was working almost correctly by send the full array content by each mpi proc..
could you add the matrix comparison to see the differences ?

@ukmo-juan-castillo
Copy link
Author

thanks Juan for the bugfix,
it's even weird that it was working almost correctly by send the full array content by each mpi proc..
could you add the matrix comparison to see the differences ?

Yes, with this type of bugs the result is completely unpredictable.

I attach the matrix comparison file but not the diff, as the diff file is 13Mb even when compressed at maximum level.
matrixCompSummary.out-ww3_tp2.14.txt

@ukmo-juan-castillo
Copy link
Author

I just copy below the differences for OASICM for reference:


/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_ww3t_to_toyt_DISTWGT.nc_diff.txt


50c50
< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_toyt_to_ww3t_DISTWGT.nc_diff.txt


50c50
< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

@mickaelaccensi
Copy link
Collaborator

I just copy below the differences for OASICM for reference:

/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_ww3t_to_toyt_DISTWGT.nc_diff.txt

50c50

< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_toyt_to_ww3t_DISTWGT.nc_diff.txt

50c50

< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

I cannot see the full difference, could you add it again ?

@ukmo-juan-castillo
Copy link
Author

I just copy below the differences for OASICM for reference:
/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_ww3t_to_toyt_DISTWGT.nc_diff.txt

50c50

< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

/scratch/d04/jcastill/WWIII-where-bug/regtests/bin/output/ww3_tp2.14/work_OASICM/rmp_toyt_to_ww3t_DISTWGT.nc_diff.txt

50c50

< :history = "Created: 10-19-2020" ;

          :history = "Created: 10-17-2020" ;

I cannot see the full difference, could you add it again ?

Yes, everything seems scrambled because of the git formatting, I will just attach one file with the information.
matrixDiff.out-ww3_tp2.14-OASICM.txt

@mickaelaccensi
Copy link
Collaborator

it just contains the differences for the rmp file, do you also have for the other files ?

@ukmo-juan-castillo
Copy link
Author

it just contains the differences for the rmp file, do you also have for the other files ?

Those are the only differences for OASICM, the differences for the rest of the tests are too large to be copied... well, I will try in different files for OASACM, OASOCM (I have similar differences for OASACMX:

matrixDiff.out-ww3_tp2.14-OASACM.txt
matrixDiff.out-ww3_tp2.14-OASOCM.txt

@mickaelaccensi
Copy link
Collaborator

well I've looked at the code again and actually I think we should keep NSEAL instead of NSEALM, which is the maximum number of local sea point per mpi proc. BTY, it's not correct to assign the full array to TMP array since they are not allocated over the same number of elements
we should rather set it explicitely :
WHERE(CHARN /= UNDEF) TMP(1:NSEAL)=CHARN(1:NSEAL)
instead of
WHERE(CHARN /= UNDEF) TMP=CHARN

You can see that the DO loop is already used to assign the water velocity fields in w3ogcmmd, which was your option1

@ukmo-juan-castillo
Copy link
Author

well I've looked at the code again and actually I think we should keep NSEAL instead of NSEALM, which is the maximum number of local sea point per mpi proc. BTY, it's not correct to assign the full array to TMP array since they are not allocated over the same number of elements
we should rather set it explicitely :
WHERE(CHARN /= UNDEF) TMP(1:NSEAL)=CHARN(1:NSEAL)
instead of
WHERE(CHARN /= UNDEF) TMP=CHARN

You can see that the DO loop is already used to assign the water velocity fields in w3ogcmmd, which was your option1

OK, I can do it that way, but I believe that the array index have to be specified in the WHERE sentence, as they generate the mask that will be applied to the operation. I will make the changes, test them, and send the code back to you.

@mickaelaccensi
Copy link
Collaborator

ok you mean doing something more like that :
WHERE(CHARN(1:NSEAL) /= UNDEF) TMP(1:NSEAL)=CHARN(1:NSEAL)

@ukmo-juan-castillo
Copy link
Author

ok you mean doing something more like that :
WHERE(CHARN(1:NSEAL) /= UNDEF) TMP(1:NSEAL)=CHARN(1:NSEAL)

Exactly - what is inside WHERE() specifies the mask for the operation, and it the dimension of the mask is different than the dimension of the parameters taking part of the operation, we can have a problem again.

@ukmo-juan-castillo
Copy link
Author

I made the changes suggested. The runtime error still disappears, and this time the regtests are more similar to the reference.

matrixCompSummary.out-ww3_tp2.14.txt
matrixDiff.out-ww3_tp2.14.txt

@@ -132,7 +132,7 @@
! Ocean sea surface current (m.s-1) (v-component)
! ---------------------------------------------------------------------
IF (SND_FLD(IB_DO)%CL_FIELD_NAME == 'WW3_WSSV') THEN
TMP(1:NSEALM) = 0.0
TMP(1:NSEAL) = 0.0
DO JSEA=1, NSEAL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not even sure this section works well since TMP array is filled after the test on UNDEF
Maybe you should also rewrite it as :
WHERE(CY(1:NSEAL) /= UNDEF) TMP(1:NSEAL)=CY(1:NSEAL)

@@ -120,7 +120,7 @@
! Ocean sea surface current (m.s-1) (u-component)
! ---------------------------------------------------------------------
IF (SND_FLD(IB_DO)%CL_FIELD_NAME == 'WW3_WSSU') THEN
TMP(1:NSEALM) = 0.0
TMP(1:NSEAL) = 0.0
DO JSEA=1, NSEAL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should also rewrite it as :
WHERE(CX(1:NSEAL) /= UNDEF) TMP(1:NSEAL)=CX(1:NSEAL)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should also rewrite it as :
WHERE(CX(1:NSEAL) /= UNDEF) TMP(1:NSEAL)=CX(1:NSEAL)

I do not think a WHERE sentence will work in this case, as CX and CY are vectors of dimension 1:NSEA. The reason why the loop is done explicitly is to find the right index for this vector for a given processor:

ISEA=IAPROC+(JSEA-1)*NAPROC

Notice that the same approach is followed for the variable ICEF in w3igcmmd.f90, which also has dimensions 1:NSEA.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right ! BTY the line 139 in w3agcmd.ftn since it force TMP to be filled by CY after the test on UNDEF

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right ! BTY the line 139 in w3agcmd.ftn since it force TMP to be filled by CY after the test on UNDEF

Agree, that line should be deleted.

@mickaelaccensi
Copy link
Collaborator

could you also remove the history attribute in the rmp files to not always have differences in those files. We still need compare them in case the data inside would be different

it can be done by commenting lines 314 and 993 in regtests/ww3_tp2.14/input/oasis3-mct/lib/scrip/src/remap_write.f

@ukmo-juan-castillo
Copy link
Author

could you also remove the history attribute in the rmp files to not always have differences in those files. We still need compare them in case the data inside would be different

it can be done by commenting lines 314 and 993 in regtests/ww3_tp2.14/input/oasis3-mct/lib/scrip/src/remap_write.f

Instead of removing the history attribute from the file by modifying the code, I will rather do it (just for rmp_ files) in the matrix.com script, as it will work even if the oasis code is updated at some point.

oasis rmp files; remove unnecesary line of code.
@ukmo-juan-castillo
Copy link
Author

Thank you very much for the quick review and more than useful comments!

Copy link
Member

@ukmo-ccbunney ukmo-ccbunney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes all look good. Thanks @ukmo-juan-castillo and @mickaelaccensi .

@ukmo-ccbunney ukmo-ccbunney merged commit e698b48 into staging_oct20 Oct 22, 2020
ukmo-ccbunney pushed a commit that referenced this pull request Oct 28, 2020
* Fix non-conforming WHERE statements in coupled routines
* Stop comparing history lines in OASIS rmp files
ukmo-ccbunney added a commit that referenced this pull request Nov 5, 2020
* Fb oasis t+0 (#13)

In the original code a coupling lag had to be used, coupling took place at the last time step but not at the first, and the coupling fields had to be written in an oasis restart file. With this ticket, the program can run in a different way by not using a coupling lag, coupling taking place at the first time step but not the last, and the coupling fields are written in the wave restart file - no extra oasis restart file needed.

* Fb uprstr inp (#15)

Enhancement to improve the way in which input data is read in and logged by the ww3_uprstr program:
  * read in variables specific to the update process selected
  * output the values provided in the ww3_uprstr.out log file
  * update the .inp template file and regtests to improve clarity and work with the changes
  * add capability to read inputs from a namelist (ww3_uprstr.nml) file

* Bf unconforming where in coupling routines (#17)

* Fix non-conforming WHERE statements in coupled routines
* Stop comparing history lines in OASIS rmp files

* Changes for efficient SMC grid coupling (#16)

* Changes for efficient SMC grid coupling
* Ensure consistency between SMC coupled test nml and inp files

Co-authored-by: Juan Manuel Castillo Sanchez <48921434+ukmo-juan-castillo@users.noreply.github.com>
Co-authored-by: Andy Saulter <48921142+ukmo-ansaulter@users.noreply.github.com>
@ukmo-ccbunney ukmo-ccbunney deleted the bf_unconforming_where branch February 24, 2021 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants