Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E3SM-kernels #1416

Closed
kiranchandramohan opened this issue Jan 26, 2022 · 10 comments
Closed

E3SM-kernels #1416

kiranchandramohan opened this issue Jan 26, 2022 · 10 comments
Assignees
Labels
apps Building Fortran applications with Flang - status + instructions

Comments

@kiranchandramohan
Copy link
Collaborator

kiranchandramohan commented Jan 26, 2022

Enable building and execution of E3SM kernels with the flang compiler.

@AlexisPerry
Copy link
Collaborator

The following is a reproducer for a segmentation fault that I get when running the atmosphere kernel. There are a number of comments inline about changes that make the segfault disappear (prefixed with !NOTE). Any suggestions for where to look to fix this? I'm compiling with flang-new -O3 -ffixed-line-length=none -L<path to libpgmath> -lpgmath

!!! UTILS MODULE
module kernel_utils
  implicit none
  public
  integer , parameter :: np = 4 !NOTE: removing "parameter" fixes segfault

  integer :: nets = 1
  integer :: nete = 16

  real, allocatable :: qtens(:,:,:,:)

contains
  subroutine init(n,a)
    implicit none
    integer , intent(in   ) :: n
    real    , intent(  out) :: a(n)
    integer :: i

    do i = 1 , n
      a(i) = 1.0
    enddo
  end subroutine init
end module kernel_utils

!!! COMPUTE MODULE
module subrt_cpu
  use kernel_utils
  implicit none
  private

  public :: subrt

contains
  function fxn(s) result(laplace)
    real, intent(in   ) :: s(np)
    real :: laplace(np)
    laplace = s !NOTE: not using s fixes the segfault
  end function fxn

  subroutine subrt(qtens)
    real , intent(inout) :: qtens(np,72,40,nets:nete) !NOTE: removing nets:nete fixes segfault
    integer :: ie
    do ie = 1 , 16 !NOTE: removing this loop fixes the segfault
    !NOTE: with loop removed, segfault occurs for ie > 5
    !NOTE: regardless of loop, removing the call to fxn fixes the segfault
    !NOTE: regardless of loop, removing any dimesion of the array fixes segfault (i.e. need at least 4D array)
        qtens(:,1,1,ie) = fxn(qtens(:,1,1,ie))
    enddo
  end subroutine subrt
end module subrt_cpu

!!! MAIN
program subrt_kernel
  use kernel_utils
  use subrt_cpu
  implicit none

  allocate(qtens(np, 72, 40, 16))
  call init( product(shape(qtens)) , qtens ) !Is this not actually initing things properly? i.e. misshapen arrays?
  call subrt(qtens)
end program subrt_kernel

@jeanPerier
Copy link
Collaborator

Hi @AlexisPerry, I think there is something wrong with the base address computation of qtens(:,1,1,ie). I made a small FIR repro below that is supposed to print the address of qtens(:,1,1,ie) with ie=2 assuming qtens base address is zero, and it prints 1843200 while I would have expect it to print 4*72*40*(2-1)*sizeof(f32)=46080.

So if you want to look into it, it might be worth looking at fir.embox codegen base address compuation (here)

Here is my small small FIR repro that you can with tco+llc and link with Fortran runtime:

func @print_addr(%arg0: !fir.ref<!fir.array<?xf32>>) {
  %c1 = arith.constant 1 : index
  %c-1_i32 = arith.constant -1 : i32
  %2 = fir.address_of(@_QQcl.2E2F726570726F2E66393000) : !fir.ref<!fir.char<1,12>>
  %3 = fir.convert %2 : (!fir.ref<!fir.char<1,12>>) -> !fir.ref<i8>
  %c28_i32 = arith.constant 28 : i32
  %4 = fir.call @_FortranAioBeginExternalListOutput(%c-1_i32, %3, %c28_i32) : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
  %12 = fir.convert %arg0 : (!fir.ref<!fir.array<?xf32>>) -> i64
  %13 = fir.call @_FortranAioOutputInteger64(%4, %12) : (!fir.ref<i8>, i64) -> i1
  %14 = fir.call @_FortranAioEndIoStatement(%4) : (!fir.ref<i8>) -> i32
  return
}

func @test(%arg0: !fir.ref<!fir.array<4x72x40x?xf32>>, %nets: index, %nete: index, %at : index) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c7 = arith.constant 7 : index
    %c40 = arith.constant 40 : index
    %undef = fir.undefined index
    %41 = fir.shape_shift %c1, %c4, %c1, %c7, %c1, %c40, %nets, %nete : (index, index, index, index, index, index, index, index) -> !fir.shapeshift<4>
    %42 = fir.slice %c1, %c7, %c1, %c1, %undef, %undef, %c1, %undef, %undef, %at, %undef, %undef : (index, index, index, index, index, index, index, index, index, index, index, index) -> !fir.slice<4>
    %43 = fir.embox %arg0(%41) [%42] : (!fir.ref<!fir.array<4x72x40x?xf32>>, !fir.shapeshift<4>, !fir.slice<4>) -> !fir.box<!fir.array<?xf32>>
    %44 = fir.box_addr %43 : (!fir.box<!fir.array<?xf32>>) -> !fir.ref<!fir.array<?xf32>>
    fir.call @print_addr(%44) : (!fir.ref<!fir.array<?xf32>>) -> ()
    return
}
func @_QQmain() {
  %c0 = arith.constant 0 : i64
  %nets = arith.constant 1 : index
  %nete = arith.constant 16 : index
  %at = arith.constant 2 : index
  %nullptr = fir.convert %c0 : (i64) -> !fir.ref<!fir.array<4x72x40x?xf32>>
  fir.call @test(%nullptr, %nets, %nete, %at) : (!fir.ref<!fir.array<4x72x40x?xf32>>, index, index, index) -> ()
  return
}

func private @_FortranAioBeginExternalListOutput(i32, !fir.ref<i8>, i32) -> !fir.ref<i8> attributes {fir.io, fir.runtime}
func private @_FortranAioOutputInteger64(!fir.ref<i8>, i64) -> i1 attributes {fir.io, fir.runtime}
func private @_FortranAioEndIoStatement(!fir.ref<i8>) -> i32 attributes {fir.io, fir.runtime}
fir.global linkonce @_QQcl.2E2F726570726F2E66393000 constant : !fir.char<1,12> {
  %0 = fir.string_lit "./repro.f90\00"(12) : !fir.char<1,12>
  fir.has_value %0 : !fir.char<1,12>
}

I suspect something is wrong with the fir.embox in @test function. Assuming the mlir file is called test.mlir, you can compile and run with:

llvmbuild= <your llvm build>
llvmsrc= <your llvm src>


filename=test
firFile=$filename.mlir
llFile=$filename.mlir.ll
objFile=$filename.mlir.o
exeFile=$filename


$llvmbuild/bin/tco -o $llFile $firFile
$llvmbuild/bin/llc --relocation-model=pic --filetype=obj $llFile
gcc -I$llvmsrc/flang/include $llvmsrc/flang/runtime/Fortran_main.c -fPIC -c
gcc Fortran_main.o $objFile -L$llvmbuild/lib -lFortranRuntime -lFortranDecimal -L$pgmathPath -lpgmath -lm -o $exeFile

@jeanPerier
Copy link
Collaborator

@AlexisPerry, are you already working on the issue? Otherwise, I can investigate more and make a fix for this issue.

@AlexisPerry
Copy link
Collaborator

@jeanPerier I've started working on it, but I haven't gotten too far yet. If you have other things you're working on, I'd prioritize those for now. I'll ping you if (when?) I get stuck. Thanks!

@jeanPerier
Copy link
Collaborator

Sounds good to me, thanks for looking into this !

@banach-space banach-space added the apps Building Fortran applications with Flang - status + instructions label Feb 24, 2022
@jeanPerier
Copy link
Collaborator

Hi @AlexisPerry, do not hesitate to ask if you need any help on this one. Given it can lead compiled program into undefined behaviors, I think we should try to get this bug fixed before lowering is fully upstreamed to avoid having people running into it without being easily able to link the issue to this one.

jeanPerier added a commit that referenced this issue Apr 4, 2022
Fix #1416.

The `constRows` variable was being decremented too soon, causing the
last constant interior dimension extent being used to multiply the GEP
offset. This lead to wrong address computation and caused segfaults.
jeanPerier added a commit that referenced this issue Apr 4, 2022
Fix #1416.

The `constRows` variable was being decremented too soon, causing the
last constant interior dimension extent being used to multiply the GEP
offset. This lead to wrong address computation and caused segfaults.
@jeanPerier
Copy link
Collaborator

Hi @AlexisPerry and @kiranchandramohan , the fix for the reported problem was merged, please close this issue whenever you are able to verify that it did actually fix the E3SM issue.

jeanPerier added a commit to jeanPerier/llvm-project that referenced this issue Apr 5, 2022
Fix flang-compiler#1416.

The `constRows` variable was being decremented too soon, causing the
last constant interior dimension extent being used to multiply the GEP
offset. This lead to wrong address computation and caused segfaults.
@AlexisPerry
Copy link
Collaborator

I just confirmed that the new patch fixes the error with the atmosphere kernel. It runs now (albeit slower) and gets the same answer as gfortran.

@AlexisPerry
Copy link
Collaborator

I get the following error when running mmf-mpdata-tracer built with flang-new -flang-experimental-exec:

error: loc("./advect_scalar2D_pushncols_openacc.F90":652:5): /vast/home/aperry/llvm-project/flang/lib/Lower/CustomIntrinsicCall.cpp:236: not yet implemented: unhandled dynamically optional arguments in SYSTEM_CLOCK or RANDOM_SEED

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
Fix flang-compiler/f18-llvm-project#1416.

The `constRows` variable was being decremented too soon, causing the
last constant interior dimension extent being used to multiply the GEP
offset. This lead to wrong address computation and caused segfaults.

Note: also upstream fir.embox tests that can be upstreamed.

Differential Revision: https://reviews.llvm.org/D123130
@AlexisPerry
Copy link
Collaborator

AlexisPerry commented Nov 4, 2022

Just checked and all the kernels build and run now.

Here is the process I used following a standard build of flang from main:

atmosphere
Add the following to the Makefile

ifdef flang
        FC = flang-new
        FFLAGS += -flang-experimental-exec -O3 -ffixed-line-length=none
endif

make flang=1
./atm

mmf-mpdata-tracer
Add the following to the Makefile

ifdef flang
        FC = flang-new
        FFLAGS = -flang-experimental-exec -O3 -ffixed-line-length=none
endif

make flang=1
./advect

nested_loops
Build openmpi-4.0.3 as described in llvm#56348
Add the openmpi install's build directory to PATH and it's lib to LD_LIBRARY_PATH
Add the following to the nested_loops Makefile

llvm:
        make nested \
        "FC=mpif90" \
        "FFLAGS=-flang-experimental-exec -O3" \
        "LDFLAGS=-O3"

make llvm
mpirun nested

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apps Building Fortran applications with Flang - status + instructions
Projects
None yet
Development

No branches or pull requests

4 participants