Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

negative return value of io_wallclocktime #266

Merged
merged 3 commits into from
Apr 20, 2019

Conversation

qiaojunfeng
Copy link
Collaborator

I have written a subroutine berry_print_progress (line 2005 of https://github.com/qiaojunfeng/wannier90/blob/shc/src/postw90/berry.F90) to estimate calculation progress when calculating spin Hall conductivity. This subroutine calls io_wallclocktime in src/io.F90.

In some situations, it will print

 -------------------------------
   k-points       wall      diff
  calculated      time      time
  ----------      ----      ----
       0%          0.0       0.0
      10%       7249.5    7249.5
      20%      14651.6    7402.1
      30%    -193063.5 -207715.1
      40%    -185758.8    7304.7
      50%    -178824.8    6934.1
      60%    -172037.3    6787.4
      70%    -164733.6    7303.7
      80%    -157577.6    7156.1
      90%    -150888.8    6688.8
     100%    -143694.7    7194.1

The 2nd column is the value returned from io_wallclocktime.

The negative value is caused by insufficient precision of integer in io_wallclocktime.
If using the default kind = 4 integer in system_clock, the count_max will be 2147483647, so the return value of system_clock will wrap around about 25 days or 2.5 days depending on the compiler.

Refs:

  1. https://gcc.gnu.org/onlinedocs/gfortran/SYSTEM_005fCLOCK.html
  2. https://stackoverflow.com/questions/46637148/do-i-use-fortran-system-clock-correctly

I have created a test case:

program abc
implicit none

integer,parameter :: i64 = selected_int_kind(15)
integer :: c0, c1, count_max,rate
integer(i64) :: c0b, c1b, count_maxb,rateb

write(*,*) i64

call system_clock(c0, rate, count_max=count_max)
write(*,*) c0,' ', rate, ' ', count_max
write(*,*) 'max days = ', real(count_max)/real(rate)/real(3600*24.0)

call system_clock(c0b, rateb, count_max=count_maxb)
write(*,*) c0b,' ', rateb, ' ', count_maxb
write(*,*) 'max years = ', real(count_maxb)/real(rateb)/real(3600*24.0*365)

end program abc

Test on machine 1:

$ ifort -v
ifort version 13.0.0
$ ifort a.f90
$ ./a.out
           8
   533120306         10000    2147483647
 max days =    2.485513
      1555473076030677                 1000000     9223372036854775807
 max years =    292471.2
$ gfortran -v
...
gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)
$ gfortran a.f90
$ ./a.out
           8
   694973398          1000    2147483647
 max days =    24.855137
        1555473134550                   1000    9223372036854775807
 max years =   2.92471232E+08

Test on machine 2:

> ifort -v
ifort version 19.0.0.117
> ifort a.f90
> ./a.out
           8
   534314646         10000    2147483647
 max days =    2.485513
      1555473195464636                 1000000     9223372036854775807
 max years =    292471.2
> gfortran -v
...
gcc version 8.2.1 20190103 [gcc-8-branch revision 267549] (SUSE Linux)
> gfortran a.f90
> ./a.out
           8
   524896202          1000    2147483647
 max days =    24.8551369
      524896202374217             1000000000    9223372036854775807
 max years =    292.471191

Test on machine 3:

$ gfortran -v
...
gcc version 8.2.1 20181127 (GCC)
$ gfortran a.f90
$ ./a.out
           8
   526333427          1000    2147483647
 max days =    24.8551369
      526333428595970             1000000000    9223372036854775807
 max years =    292.471191

So, the wrap around period of Intel compiler is 2.5 days, while that of gfortran is 25 days.
I used Intel compiler and that's why I got negative time in the calculation: the system encountered a wrap around during the calculation (note the calculation finished in a day, did not exceed 2.5 days).

In the commit, I have used use kind = 8 integer in io_wallclocktime. Though not totally get rid of wrap around error, this should alleviate the problem to a large extent.

@codecov
Copy link

codecov bot commented Apr 17, 2019

Codecov Report

Merging #266 into develop will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #266   +/-   ##
========================================
  Coverage    63.34%   63.34%           
========================================
  Files           29       29           
  Lines        17979    17979           
========================================
  Hits         11389    11389           
  Misses        6590     6590
Impacted Files Coverage Δ
src/io.F90 70.34% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35cad1f...af4f3c0. Read the comment docs.

@jryates
Copy link
Member

jryates commented Apr 18, 2019

I have a slight nervousness about introducing 64 bit integers in the code. Probably it is ok - but I would like @mostofi @giovannipizzi @VVitale to approve this too.

An alternative fix would be for the clock to check for wrap around and adjust (by adding on the repeat period). Of course that is not perfect, as you might have two or more such events between calls to the clock.

@giovannipizzi
Copy link
Member

@jryates I see your doubts, but anyway this is only defined internally in the code and the only routine getting the variable is system_clock, that by specification should be able to accept different lengths of integers, at least on gfortran and ifort.

So I think this should be ok (and moreover on the test farm it's going to be tested also on other compilers).

@jryates jryates merged commit cd676ad into wannier-developers:develop Apr 20, 2019
@qiaojunfeng qiaojunfeng deleted the io_wallclocktime branch January 23, 2020 10:42
manxkim pushed a commit to manxkim/wannier90 that referenced this pull request Jan 10, 2021
…cktime

negative return value of io_wallclocktime
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants