OpenFAST dev branch segfaults Intel OneAPI 2023.2 #2135

Closed
jrood-nrel opened this issue Apr 1, 2024 · 24 comments

@jrood-nrel
Collaborator

jrood-nrel commented Apr 1, 2024

modules/aerodyn/src/AeroDyn_Inflow.f90: error #5623: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.

Can someone try building OpenFAST where AeroDyn_Inflow.f90 is compiled? I believe it should segfault Intel OneAPI 2023.2 using the LLVM compilers.

@andrew-platt
Collaborator

andrew-platt commented Apr 1, 2024

That's an odd error. We'll definitely have to fix that.

FYI: AeroDyn_Inflow.f90 is only used with the AeroDyn_Driver and AeroDyn_Inflow_C_Bindings interface, not with OpenFAST. But it is included in the aerodynlib and gets built with AeroDyn as a result -- not sure why we did it that way.

@andrew-platt
Collaborator

andrew-platt commented Apr 2, 2024

I don't know for certain if PR #2136 will solve this issue. @jrood-nrel, could you check if that branch fixes it?

@deslaughter
Collaborator

@andrew-platt I made the mistake of including AeroDyn_Inflow.f90 in aerodynlib as part of refactoring the CMake files for v3.5.0. At that point I didn't understand the separate use of this library. Thanks for fixing it.

@jrood-nrel
Collaborator Author

Sure, I will try it.

@andrew-platt
Collaborator

@jrood-nrel, I merged this into dev, so you can grab that branch instead if it is easier.

@bjonkman
Contributor

bjonkman commented Apr 2, 2024

FYI, Intel OneAPI 2024.1 also gives an internal compiler error on ModVar.f90 in the dev-unstable-pointers branch. I have traced it to two lines that are supposed to perform automatic deallocation/allocation of a type.
Line 568: [image: code screenshot]
Line 651: [image: code screenshot]

@andrew-platt
Collaborator

@bjonkman, Thanks for the info! @deslaughter, do we have other instances of the automatic deallocation/allocation that may not be supported by all compilers yet?

@jrood-nrel
Collaborator Author

I still see this with the dev branch. I am using the CPP bindings btw, so does that mean I always compile this file?

modules/aerodyn/src/AeroDyn_Inflow.f90: error #5623: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.

@andrew-platt
Collaborator

andrew-platt commented Apr 2, 2024

There must still be an issue in AeroDyn_Inflow.f90 itself, and not just an issue with the cmake libs.

Are you compiling aerodyn_inflow_c_bindings as well as the OpenFAST CPP interface?

@jrood-nrel
Collaborator Author

Paths redacted:

-DBUILD_DOCUMENTATION:BOOL=OFF
-DBUILD_TESTING:BOOL=OFF
-DBUILD_SHARED_LIBS:BOOL=ON
-DDOUBLE_PRECISION:BOOL=ON
-DUSE_DLL_INTERFACE:BOOL=ON
-DBUILD_OPENFAST_CPP_API:BOOL=ON
-DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=ON
-DBLAS_LIBRARIES:STRING=/path/lib/libopenblas.so
-DLAPACK_LIBRARIES:STRING=/path/lib/libopenblas.so
-DCMAKE_CXX_COMPILER:STRING=/path/mpicxx
-DCMAKE_C_COMPILER:STRING=/path/mpicc
-DCMAKE_Fortran_COMPILER:STRING=/path/mpif90
-DMPI_CXX_COMPILER:STRING=/path/mpicxx
-DMPI_C_COMPILER:STRING=/path/mpicc
-DMPI_Fortran_COMPILER:STRING=/path/mpif90
-DHDF5_ROOT:STRING=/path
-DYAML_ROOT:STRING=/path
-DHDF5_NO_FIND_PACKAGE_CONFIG_FILE:BOOL=ON
-DNETCDF_ROOT:STRING=/path

@andrew-platt
Collaborator

What is your make command?

@jrood-nrel
Collaborator Author

make

@andrew-platt
Collaborator

andrew-platt commented Apr 2, 2024

Ah. That ends up building all targets including aerodyn_driver and aerodyn_inflow_c_bindings. I'm guessing you don't need all the module drivers, module wrappers, or TurbSim.

As a temporary workaround, could you build only the targets of interest, i.e. make openfast openfastcpp?
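A minimal sketch of that workaround, assuming an already configured Makefile build tree. `BUILD_DIR` is a placeholder and not something from this thread; the target names are the two mentioned above. The script prints the command and only runs it if a generated Makefile is actually present.

```shell
#!/bin/sh
# Build only the named targets instead of the default "all" target.
# BUILD_DIR is a placeholder for your configured build directory (assumption).
BUILD_DIR="${BUILD_DIR:-build}"
TARGETS="openfast openfastcpp"

# Compose the command first so it can be inspected before it is run.
BUILD_CMD="make -C $BUILD_DIR $TARGETS"
echo "$BUILD_CMD"

# Only run when a generated Makefile exists in BUILD_DIR.
if [ -f "$BUILD_DIR/Makefile" ]; then
    $BUILD_CMD
fi
```

A generator-independent equivalent is `cmake --build "$BUILD_DIR" --target openfast openfastcpp`; passing several targets to `--target` needs CMake 3.15 or newer.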

@andrew-platt
Collaborator

From @deslaughter:

> Intel's LLVM-based Fortran compiler is really new. We haven't said we're going to support it yet, AFAIK.

We will work towards fully supporting Intel's LLVM in the future, but I don't know how soon that will be. So if you can work around the issue by specifying the targets, that would be preferable while we find time/resources to fully test with Intel's LLVM.

@deslaughter
Collaborator

> FYI, Intel OneAPI 2024.1 also gives an internal compiler error on ModVar.f90 in the dev-unstable-pointers branch. I have traced it to two lines that are supposed to perform automatic deallocation/allocation of a type.

@bjonkman Thanks for tracing this down. I'm really surprised this is causing an issue since it's part of the Fortran 2003 standard. I'll take a look and see if I can figure out what's happening.

> Do we have other instances of the automatic deallocation/allocation that may not be supported by all compilers yet?

I think I was mostly using it in the new tight-coupling code. I do remember seeing an instance in AeroDyn that caused problems with Flang, but I don't think it was related to ADI.

@jrood-nrel
Collaborator Author

Using make openfast openfastcpp gets past the build, but make install still builds all the targets and fails with the same segfault. How can I run make install without building everything?
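One possible approach, offered as an untested sketch rather than a confirmed answer from the maintainers: `cmake --install` (CMake 3.15 or newer) runs the install rules without first building the `all` target, which is what `make install` does by default. `BUILD_DIR` and `PREFIX` are placeholders, not values from this thread.

```shell
#!/bin/sh
# Run the install step without triggering a build of the "all" target.
# BUILD_DIR and PREFIX are placeholders (assumptions), not values from this thread.
BUILD_DIR="${BUILD_DIR:-build}"
PREFIX="${PREFIX:-$HOME/openfast-install}"

# Compose the command first so it can be inspected before it is run.
INSTALL_CMD="cmake --install $BUILD_DIR --prefix $PREFIX"
echo "$INSTALL_CMD"

# Only run when the build tree contains generated install rules.
if [ -f "$BUILD_DIR/cmake_install.cmake" ]; then
    $INSTALL_CMD
fi
```

The install rules may still reference files from targets that were never built, in which case the step stops with a missing-file error. Whether that happens with the OpenFAST CMake files has not been checked here. Configuring with `-DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON` is the other CMake mechanism for decoupling `make install` from `all`, with the same caveat.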

@bjonkman
Contributor

bjonkman commented Apr 2, 2024

> > FYI, Intel OneAPI 2024.1 also gives an internal compiler error on ModVar.f90 in the dev-unstable-pointers branch. I have traced it to two lines that are supposed to perform automatic deallocation/allocation of a type.
>
> @bjonkman Thanks for tracing this down. I'm really surprised this is causing an issue since it's part of the Fortran 2003 standard. I'll take a look and see if I can figure out what's happening.

When I replace it with a more traditional Fortran (and longer) allocation method, it works.
[image: screenshot of the replacement code]

Internal compiler errors are bugs in the compiler, which may or may not be caused by valid code. I actually misspoke on what I was using to build. The 32-bit compiler is deprecated, so it's using a slightly different version. The error is with Intel Fortran Compiler Classic 2021.12.0 [IA-32] (IFORT). The Intel Fortran Compiler 2024.1.0 [Intel 64] (IFX) builds the original code fine.
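Because the classic and LLVM-based compilers are easy to mix up, checking the version banner can help. This is a small sketch that assumes the Linux-style banners, which contain `(IFORT)` or `(IFX)`; the sample strings are illustrative rather than output captured in this thread, and the Windows banners are worded differently. `classify_intel_fortran` and `FC` are hypothetical names.

```shell
#!/bin/sh
# Classify a compiler version banner as classic ifort or LLVM-based ifx.
# Assumes Linux-style banners; the sample strings below are illustrative only.
classify_intel_fortran() {
    case "$1" in
        *"(IFORT)"*) echo "classic (ifort)" ;;
        *"(IFX)"*)   echo "LLVM-based (ifx)" ;;
        *)           echo "unknown" ;;
    esac
}

classify_intel_fortran "ifort (IFORT) 2021.12.0"
classify_intel_fortran "ifx (IFX) 2024.1.0"

# With a real compiler on PATH (FC is a placeholder for its name):
#   classify_intel_fortran "$("$FC" --version 2>/dev/null | head -n 1)"
```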

@deslaughter
Collaborator

@bjonkman Thanks for clarifying. I'm glad that the traditional method still works.

@jrood-nrel @andrew-platt I've tracked the ADI library compiler bug down to OpenMP: without OpenMP enabled, the compile succeeds. OpenMP is being enabled by use of the C++ API. ADI_CalcOutput_IW uses !$OMP directives, so maybe there's something wrong with those comments. I'll dig a little more.

@jrood-nrel
Collaborator Author

OK, I will try that. We already disable OpenMP in this line when building on macOS on our laptops, to avoid build errors:

if (OPENMP OR BUILD_FASTFARM OR BUILD_OPENFAST_CPP_API)

@deslaughter
Collaborator

@andrew-platt I think we should remove the OMP comments from the section of code that's causing the issue.

!$OMP PARALLEL DEFAULT(SHARED)
Unless there are a huge number of points, I don't expect splitting the loop over multiple threads to significantly increase performance.
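To locate the directives in question, something like the following sketch could be used. `list_omp_directives` is a hypothetical helper name, and the demo file is a throwaway illustration, not code taken from AeroDyn_Inflow.f90.

```shell
#!/bin/sh
# List OpenMP directive lines (free-form "!$OMP" sentinel) in a Fortran source file.
list_omp_directives() {
    # -n prints line numbers; -i because the sentinel is case-insensitive.
    grep -n -i '^[[:space:]]*!\$omp' "$1"
}

# Demo on a tiny throwaway file (illustrative loop only).
demo="$(mktemp)"
cat > "$demo" <<'EOF'
subroutine demo(n, a)
   integer :: n, i
   real :: a(n)
   !$OMP PARALLEL DEFAULT(SHARED)
   !$OMP DO PRIVATE(i)
   do i = 1, n
      a(i) = 2.0 * a(i)
   end do
   !$OMP END DO
   !$OMP END PARALLEL
end subroutine demo
EOF
list_omp_directives "$demo"
rm -f "$demo"
```

From the repository root, the real file would be checked with `list_omp_directives modules/aerodyn/src/AeroDyn_Inflow.f90`.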

@jrood-nrel
Collaborator Author

Disabling OpenMP solves the segfault for us. We will just always disable OpenMP. Thanks for the help.
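One way to confirm that OpenMP really is off in a configured build tree is to search the generated compile flags. This sketch assumes a Makefile generator, which writes per-target `flags.make` files, and looks for the common flag spellings (`-qopenmp`, `-fiopenmp`, `-fopenmp`). `openmp_flag_files` is a hypothetical helper and the demo tree is made up for illustration.

```shell
#!/bin/sh
# Print every generated flags.make under a build tree that mentions an OpenMP flag.
# Matches -qopenmp and -fiopenmp (Intel) as well as -fopenmp (GNU and others).
openmp_flag_files() {
    find "$1" -name flags.make -exec grep -lE -e '-(q|f|fi)openmp' {} +
}

# Demo on a throwaway tree; the flag lines are invented for illustration.
tree="$(mktemp -d)"
mkdir -p "$tree/a.dir" "$tree/b.dir"
echo 'Fortran_FLAGS = -O2 -qopenmp' > "$tree/a.dir/flags.make"
echo 'Fortran_FLAGS = -O2' > "$tree/b.dir/flags.make"
openmp_flag_files "$tree"   # lists only the file under a.dir
rm -rf "$tree"
```

An empty result for the real build directory suggests no target is being compiled with OpenMP.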

@andrew-platt
Collaborator

andrew-platt commented Apr 2, 2024

I agree on removing OMP from AeroDyn_Inflow.f90. This may incur a performance penalty in the FVW module when we have ~100k points, but we intend to change how the data is accessed there with the introduction of the FlowField data structure in dev-unstable-pointers.

See #2140

andrew-platt added this to the v4.0.0 milestone Apr 2, 2024
andrew-platt added a commit to andrew-platt/openfast that referenced this issue Apr 2, 2024
This was causing a compiler fault with OneAPI 2023.2 with LLVM (see issue OpenFAST#2135)
@andrew-platt
Collaborator

In theory, #2140 should fix the OneAPI compiler segfault.

@andrew-platt
Collaborator

Fix included in v4.0.0 (#2586)
