E3SM-kernels #1416

kiranchandramohan · 2022-01-26T17:16:31Z

Enable building and execution of E3SM kernels with the flang compiler.

AlexisPerry · 2022-02-01T20:23:53Z

The following is a reproducer for a segmentation fault that I get when running the atmosphere kernel. There are a number of comments inline about changes that make the segfault disappear (prefixed with !NOTE). Any suggestions for where to look to fix this? I'm compiling with flang-new -O3 -ffixed-line-length=none -L<path to libpgmath> -lpgmath

!!! UTILS MODULE
module kernel_utils
  implicit none
  public
  integer , parameter :: np = 4 !NOTE: removing "parameter" fixes segfault

  integer :: nets = 1
  integer :: nete = 16

  real, allocatable :: qtens(:,:,:,:)

contains
  subroutine init(n,a)
    implicit none
    integer , intent(in   ) :: n
    real    , intent(  out) :: a(n)
    integer :: i

    do i = 1 , n
      a(i) = 1.0
    enddo
  end subroutine init
end module kernel_utils

!!! COMPUTE MODULE
module subrt_cpu
  use kernel_utils
  implicit none
  private

  public :: subrt

contains
  function fxn(s) result(laplace)
    real, intent(in   ) :: s(np)
    real :: laplace(np)
    laplace = s !NOTE: not using s fixes the segfault
  end function fxn

  subroutine subrt(qtens)
    real , intent(inout) :: qtens(np,72,40,nets:nete) !NOTE: removing nets:nete fixes segfault
    integer :: ie
    do ie = 1 , 16 !NOTE: removing this loop fixes the segfault
    !NOTE: with loop removed, segfault occurs for ie > 5
    !NOTE: regardless of loop, removing the call to fxn fixes the segfault
    !NOTE: regardless of loop, removing any dimesion of the array fixes segfault (i.e. need at least 4D array)
        qtens(:,1,1,ie) = fxn(qtens(:,1,1,ie))
    enddo
  end subroutine subrt
end module subrt_cpu

!!! MAIN
program subrt_kernel
  use kernel_utils
  use subrt_cpu
  implicit none

  allocate(qtens(np, 72, 40, 16))
  call init( product(shape(qtens)) , qtens ) !Is this not actually initing things properly? i.e. misshapen arrays?
  call subrt(qtens)
end program subrt_kernel

jeanPerier · 2022-02-02T12:15:22Z

Hi @AlexisPerry, I think there is something wrong with the base address computation of qtens(:,1,1,ie). I made a small FIR repro below that is supposed to print the address of qtens(:,1,1,ie) with ie=2 assuming qtens base address is zero, and it prints 1843200 while I would have expect it to print 4*72*40*(2-1)*sizeof(f32)=46080.

So if you want to look into it, it might be worth looking at fir.embox codegen base address compuation (here)

Here is my small small FIR repro that you can with tco+llc and link with Fortran runtime:

func @print_addr(%arg0: !fir.ref<!fir.array<?xf32>>) {
  %c1 = arith.constant 1 : index
  %c-1_i32 = arith.constant -1 : i32
  %2 = fir.address_of(@_QQcl.2E2F726570726F2E66393000) : !fir.ref<!fir.char<1,12>>
  %3 = fir.convert %2 : (!fir.ref<!fir.char<1,12>>) -> !fir.ref<i8>
  %c28_i32 = arith.constant 28 : i32
  %4 = fir.call @_FortranAioBeginExternalListOutput(%c-1_i32, %3, %c28_i32) : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
  %12 = fir.convert %arg0 : (!fir.ref<!fir.array<?xf32>>) -> i64
  %13 = fir.call @_FortranAioOutputInteger64(%4, %12) : (!fir.ref<i8>, i64) -> i1
  %14 = fir.call @_FortranAioEndIoStatement(%4) : (!fir.ref<i8>) -> i32
  return
}

func @test(%arg0: !fir.ref<!fir.array<4x72x40x?xf32>>, %nets: index, %nete: index, %at : index) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c7 = arith.constant 7 : index
    %c40 = arith.constant 40 : index
    %undef = fir.undefined index
    %41 = fir.shape_shift %c1, %c4, %c1, %c7, %c1, %c40, %nets, %nete : (index, index, index, index, index, index, index, index) -> !fir.shapeshift<4>
    %42 = fir.slice %c1, %c7, %c1, %c1, %undef, %undef, %c1, %undef, %undef, %at, %undef, %undef : (index, index, index, index, index, index, index, index, index, index, index, index) -> !fir.slice<4>
    %43 = fir.embox %arg0(%41) [%42] : (!fir.ref<!fir.array<4x72x40x?xf32>>, !fir.shapeshift<4>, !fir.slice<4>) -> !fir.box<!fir.array<?xf32>>
    %44 = fir.box_addr %43 : (!fir.box<!fir.array<?xf32>>) -> !fir.ref<!fir.array<?xf32>>
    fir.call @print_addr(%44) : (!fir.ref<!fir.array<?xf32>>) -> ()
    return
}
func @_QQmain() {
  %c0 = arith.constant 0 : i64
  %nets = arith.constant 1 : index
  %nete = arith.constant 16 : index
  %at = arith.constant 2 : index
  %nullptr = fir.convert %c0 : (i64) -> !fir.ref<!fir.array<4x72x40x?xf32>>
  fir.call @test(%nullptr, %nets, %nete, %at) : (!fir.ref<!fir.array<4x72x40x?xf32>>, index, index, index) -> ()
  return
}

func private @_FortranAioBeginExternalListOutput(i32, !fir.ref<i8>, i32) -> !fir.ref<i8> attributes {fir.io, fir.runtime}
func private @_FortranAioOutputInteger64(!fir.ref<i8>, i64) -> i1 attributes {fir.io, fir.runtime}
func private @_FortranAioEndIoStatement(!fir.ref<i8>) -> i32 attributes {fir.io, fir.runtime}
fir.global linkonce @_QQcl.2E2F726570726F2E66393000 constant : !fir.char<1,12> {
  %0 = fir.string_lit "./repro.f90\00"(12) : !fir.char<1,12>
  fir.has_value %0 : !fir.char<1,12>
}

I suspect something is wrong with the fir.embox in @test function. Assuming the mlir file is called test.mlir, you can compile and run with:

llvmbuild= <your llvm build>
llvmsrc= <your llvm src>


filename=test
firFile=$filename.mlir
llFile=$filename.mlir.ll
objFile=$filename.mlir.o
exeFile=$filename


$llvmbuild/bin/tco -o $llFile $firFile
$llvmbuild/bin/llc --relocation-model=pic --filetype=obj $llFile
gcc -I$llvmsrc/flang/include $llvmsrc/flang/runtime/Fortran_main.c -fPIC -c
gcc Fortran_main.o $objFile -L$llvmbuild/lib -lFortranRuntime -lFortranDecimal -L$pgmathPath -lpgmath -lm -o $exeFile

jeanPerier · 2022-02-09T10:31:17Z

@AlexisPerry, are you already working on the issue? Otherwise, I can investigate more and make a fix for this issue.

AlexisPerry · 2022-02-09T16:29:27Z

@jeanPerier I've started working on it, but I haven't gotten too far yet. If you have other things you're working on, I'd prioritize those for now. I'll ping you if (when?) I get stuck. Thanks!

jeanPerier · 2022-02-09T18:33:11Z

Sounds good to me, thanks for looking into this !

jeanPerier · 2022-03-08T15:36:44Z

Hi @AlexisPerry, do not hesitate to ask if you need any help on this one. Given it can lead compiled program into undefined behaviors, I think we should try to get this bug fixed before lowering is fully upstreamed to avoid having people running into it without being easily able to link the issue to this one.

Fix #1416. The `constRows` variable was being decremented too soon, causing the last constant interior dimension extent being used to multiply the GEP offset. This lead to wrong address computation and caused segfaults.

jeanPerier · 2022-04-05T07:23:06Z

Hi @AlexisPerry and @kiranchandramohan , the fix for the reported problem was merged, please close this issue whenever you are able to verify that it did actually fix the E3SM issue.

Fix flang-compiler#1416. The `constRows` variable was being decremented too soon, causing the last constant interior dimension extent being used to multiply the GEP offset. This lead to wrong address computation and caused segfaults.

AlexisPerry · 2022-04-05T20:05:12Z

I just confirmed that the new patch fixes the error with the atmosphere kernel. It runs now (albeit slower) and gets the same answer as gfortran.

AlexisPerry · 2022-07-20T20:59:46Z

I get the following error when running mmf-mpdata-tracer built with flang-new -flang-experimental-exec:

error: loc("./advect_scalar2D_pushncols_openacc.F90":652:5): /vast/home/aperry/llvm-project/flang/lib/Lower/CustomIntrinsicCall.cpp:236: not yet implemented: unhandled dynamically optional arguments in SYSTEM_CLOCK or RANDOM_SEED

Fix flang-compiler/f18-llvm-project#1416. The `constRows` variable was being decremented too soon, causing the last constant interior dimension extent being used to multiply the GEP offset. This lead to wrong address computation and caused segfaults. Note: also upstream fir.embox tests that can be upstreamed. Differential Revision: https://reviews.llvm.org/D123130

AlexisPerry · 2022-11-04T21:45:06Z

Just checked and all the kernels build and run now.

Here is the process I used following a standard build of flang from main:

atmosphere
Add the following to the Makefile

ifdef flang
        FC = flang-new
        FFLAGS += -flang-experimental-exec -O3 -ffixed-line-length=none
endif

make flang=1
./atm

mmf-mpdata-tracer
Add the following to the Makefile

ifdef flang
        FC = flang-new
        FFLAGS = -flang-experimental-exec -O3 -ffixed-line-length=none
endif

make flang=1
./advect

nested_loops
Build openmpi-4.0.3 as described in llvm#56348
Add the openmpi install's build directory to PATH and it's lib to LD_LIBRARY_PATH
Add the following to the nested_loops Makefile

llvm:
        make nested \
        "FC=mpif90" \
        "FFLAGS=-flang-experimental-exec -O3" \
        "LDFLAGS=-O3"

make llvm
mpirun nested

kiranchandramohan assigned AlexisPerry Jan 26, 2022

banach-space added the apps Building Fortran applications with Flang - status + instructions label Feb 24, 2022

jeanPerier mentioned this issue Apr 4, 2022

[flang] Fix fir.embox codegen with constant interior shape #1564

Merged

jeanPerier closed this as completed in llvm/llvm-project@0601a0d Apr 5, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E3SM-kernels #1416

E3SM-kernels #1416

kiranchandramohan commented Jan 26, 2022 •

edited

Loading

AlexisPerry commented Feb 1, 2022

jeanPerier commented Feb 2, 2022

jeanPerier commented Feb 9, 2022

AlexisPerry commented Feb 9, 2022

jeanPerier commented Feb 9, 2022

jeanPerier commented Mar 8, 2022

jeanPerier commented Apr 5, 2022

AlexisPerry commented Apr 5, 2022

AlexisPerry commented Jul 20, 2022

AlexisPerry commented Nov 4, 2022 •

edited

Loading

E3SM-kernels #1416

E3SM-kernels #1416

Comments

kiranchandramohan commented Jan 26, 2022 • edited Loading

AlexisPerry commented Feb 1, 2022

jeanPerier commented Feb 2, 2022

jeanPerier commented Feb 9, 2022

AlexisPerry commented Feb 9, 2022

jeanPerier commented Feb 9, 2022

jeanPerier commented Mar 8, 2022

jeanPerier commented Apr 5, 2022

AlexisPerry commented Apr 5, 2022

AlexisPerry commented Jul 20, 2022

AlexisPerry commented Nov 4, 2022 • edited Loading

kiranchandramohan commented Jan 26, 2022 •

edited

Loading

AlexisPerry commented Nov 4, 2022 •

edited

Loading