Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New builder: SuperLU_MT #983

Merged
merged 9 commits into from
May 8, 2020
Merged

New builder: SuperLU_MT #983

merged 9 commits into from
May 8, 2020

Conversation

christopher-dG
Copy link
Contributor

This is primarily for Sundials. I've left all the platforms in here, although I am 99% sure that plenty will fail -- I mostly want to use the CI to find out 😅

Question for @ViralBShah and/or @ChrisRackauckas: SuperLU_MT supports four precisions:

  • single precision real
  • double precision real
  • single precision complex
  • double precision complex

It doesn't seem like they can coexist; trying to compile them together into one shared library fails because each precision defines functions with the same name.

Is there a preference for which precision is used? I just used single-real while I was getting it working but they all compile successfully on their own. I'm not really sure if using complex means that it no longer supports real numbers or something...

@ViralBShah
Copy link
Member

I am guessing double precision real is what you want. But perhaps more a question for @ChrisRackauckas.

@ChrisRackauckas
Copy link

double precision real.

@ViralBShah
Copy link
Member

cc @jd-lara

@christopher-dG
Copy link
Contributor Author

Thanks guys. Seems that only Linux is building properly right now, it is okay to restrict compatibility to Linux or should we try to support anything else? I'm not sure what SuperLU_MT itself claims to support.

@ViralBShah
Copy link
Member

If we don't get windows and mac support, that means we'll lose that for sundials too.

@ViralBShah
Copy link
Member

ViralBShah commented May 4, 2020

I think what is required is to use gcc on all the platforms, if you use openmp. Mac otherwise uses clang.

On windows seems like a link problem that is preventing it from finding BLAS. Should be easy to fix.

@christopher-dG
Copy link
Contributor Author

Ah, now that I look at the errors, they are not so scary... I'll look into getting them fixed 🙂

@christopher-dG
Copy link
Contributor Author

On MacOS with CC = gcc, here's what I get now:

( cd SRC; make double )
make[1]: Entering directory '/workspace/srcdir/SuperLU_MT_3.1/SRC'
gcc -shared -o ../lib/libsuperlu_mt_OPENMP.so dreadhb.o dreadrb.o dmatgen.o pdgssv.o pdgssvx.o dgstrs.o dgsrfs.o dgscon.o dlacon.o dlangs.o dgsequ.o dlaqgs.o dpivotgrowth.o pdmemory.o pdutil.o dmyblas2.o pdgstrf.o pdgstrf_init.o pdgstrf_thread.o pdgstrf_thread_init.o pdgstrf_thread_finalize.o pdgstrf_factor_snode.o pdgstrf_snode_dfs.o pdgstrf_snode_bmod.o pdgstrf_panel_dfs.o pdgstrf_panel_bmod.o pdgstrf_copy_to_ucol.o pdgstrf_pivotL.o pdgstrf_column_dfs.o pdgstrf_column_bmod.o pdgstrf_bmod1D.o pdgstrf_bmod2D.o pdgstrf_bmod1D_mv2.o pdgstrf_bmod2D_mv2.o dsp_blas2.o dsp_blas3.o  superlu_timer.o dclock.o sp_ienv.o lsame.o xerbla.o util.o pmemory.o qrnzcnt.o cholnzcnt.o await.o get_perm_c.o mmd.o colamd.o sp_coletree.o pxgstrf_scheduler.o sp_colorder.o pxgstrf_mark_busy_descends.o pxgstrf_pruneL.o pxgstrf_super_bnd_dfs.o pxgstrf_relax_snode.o heap_relax_snode.o pxgstrf_synch.o pxgstrf_finalize.o dlamch.o
Undefined symbols for architecture x86_64:
  "_dasum", referenced from:
      _dlacon_ in dlacon.o
  "_daxpy", referenced from:
      _dgsrfs in dgsrfs.o
  "_dcopy", referenced from:
      _dgsrfs in dgsrfs.o
      _dlacon_ in dlacon.o
  "_dtrsv", referenced from:
      _sp_dtrsv in dsp_blas2.o
     (maybe you meant: _sp_dtrsv)
  "_idamax", referenced from:
      _dlacon_ in dlacon.o
  "_omp_get_wtime", referenced from:
      _SuperLU_timer_ in superlu_timer.o
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:108: double] Error 1
make[1]: Leaving directory '/workspace/srcdir/SuperLU_MT_3.1/SRC'
make: *** [Makefile:33: superlulib] Error 2
``

@giordano
Copy link
Member

giordano commented May 4, 2020

../lib/libsuperlu_mt_OPENMP.so for macOS?

Maybe I'm missing something, but the errors seem due to the fact that it's not linking to openblas nor libgomp. Also, the names of the symbols will be wrong, they miss the trailing 64

@christopher-dG
Copy link
Contributor Author

Oops, I should definitely have used dlext instead of hardcoding the library extension to .so

@christopher-dG
Copy link
Contributor Author

I can also get rid of the error about _omp_get_wtime by properly using -fopenmp, but the other ones remain (which are openblas functions I think).

@giordano
Copy link
Member

giordano commented May 4, 2020

You have to link to openblas, but also rename all the symbols. This is the kind of dance you need to do. I'm surprised you didn't have to for Linux

@ViralBShah
Copy link
Member

Do make sure SuperLU supports a 64-bit BLAS. Otherwise, link against OpenBLAS32.

@christopher-dG
Copy link
Contributor Author

Looks like the symbol renaming is indeed working! Taking a little screen break but I should have an update tonight.

@ViralBShah
Copy link
Member

You do need to make sure it can pass 64-bit integers to blas in the build system.

@christopher-dG
Copy link
Contributor Author

The only info I can find about 64-bit BLAS compatibility is the following:

- `-D_LONGINT`: use 64-bit integers for indexing sparse matrices. (default is 32-bit)

I've defined that on 64-bit systems that are not aarch64. Is that sufficient or is there more stuff I should be checking?

@giordano
Copy link
Member

giordano commented May 5, 2020

Good to go?

@christopher-dG
Copy link
Contributor Author

Just looking for @ViralBShah's approval on whether I've done the appropriate check for 64-bit BLAS.

@ViralBShah
Copy link
Member

ViralBShah commented May 5, 2020

It looks like they don't support a 64-bit BLAS. I only see 32-bit ints everywhere. So best to go with OPENBLAS32_jll.

The -D_LONGINT determines whether the integers in the sparse matrix data structure are 32-bit or 64-bit. I guess since the purpose is for Sundials to use this, we need to know how sundials is built. I would suggest start with LONGINT on 64-bit systems and not 32-bit systems, get it merged, and start trying out sundials. Perhaps @jd-lara might know.

The other thing to do is to build 4 versions of the library for each different precision and just give each one a different name. That way we put that worry behind us, and all precisions are available. That is how we do it in the other sparse solvers like MUMPS too.

@giordano
Copy link
Member

giordano commented May 5, 2020

The other thing to do is to build 4 versions of the library for each different precision and just give each one a different name.

So maybe it's better to reorganise the scripts so that there is a single common file that all the different versions include?

@christopher-dG
Copy link
Contributor Author

@christopher-dG
Copy link
Contributor Author

christopher-dG commented May 5, 2020

Ah never mind, the long long int is only ever int_t and not int. So the ints you pointed out are indeed 32-bit.

@christopher-dG
Copy link
Contributor Author

Ok, 32-bit BLAS it is. I can go the route of 4 separate products too if we want. Should I go for it?

@ViralBShah
Copy link
Member

Ok, 32-bit BLAS it is. I can go the route of 4 separate products too if we want. Should I go for it?

I think it would be great if you could. Simpler to do it when you are in the midst of the build system.

@jd-lara
Copy link
Contributor

jd-lara commented May 5, 2020

I am pretty sure we will need to build Sundials with only 32-bit OpenBlas to make all of the stack compatible.

@christopher-dG
Copy link
Contributor Author

Ok, on it!

@ViralBShah
Copy link
Member

ViralBShah commented May 5, 2020

I am curious to see how it works out. Currently sundials uses openblas with mangled names for 64-bit, and 32-bit names should thus not clash. It is possible that it just works out. I would at least love to try it out and see what happens.

It may also not be too hard to make superlu call 64-bit BLAS.

@jd-lara
Copy link
Contributor

jd-lara commented May 5, 2020

I am curious to see how it works out. Currently sundials uses openblas with mangled names for 64-bit, and 32-bit names should thus not clash. It is possible that it just works out. I would at least love to try it out and see what happens.

It also not be too hard to make superlu call 64-bit BLAS.

From the sundials manual:

"
Moreover, since the superlumt library may be installed to support either 32-bit or 64-bit integers, it is assumed that the superlumt library is installed using the same integer precision as the sundials sunindextype option
"

@ViralBShah
Copy link
Member

Just re-clarifying - that is only applicable to the index type in the sparse matrix data structure and not the BLAS (Both in the case of sundials and superlu)

@jd-lara
Copy link
Contributor

jd-lara commented May 5, 2020

Just re-clarifying - that is only applicable to the index type in the sparse matrix data structure and not the BLAS (Both in the case of sundials and superlu)

Yes, this applies to the size of the matrix. Basically, in theory we can use Sundials with arrays size n=2^64-1 but I have had problems when mixing 32-bit and 64-bit indexed packages even for small problems in the past.

@ViralBShah
Copy link
Member

@jd-lara When we were building sundials, we did not have openblas32_jll. Now that we have it, we can move to 32-bit blas in sundials across the board. I agree that should be sufficient.

@jd-lara
Copy link
Contributor

jd-lara commented May 5, 2020

@jd-lara When we were building sundials, we did not have openblas32_jll. Now that we have it, we can move to 32-bit blas in sundials across the board. I agree that should be sufficient.

Ok, this will also be relevant for #845 since this PR and PETSc are both relevant to expand the capabilities of Sundials.

echo "NOOPTS += -fPIC -fopenmp \$(BLASLIB)" >> make.inc

# On 64-bit systems, use 64-bit integers for indexing.
if [[ "$nbits" == 64 ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are building this for Sundials and modify Sundials to work with only 32-Bits

"
Moreover, since the superlumt library may be installed to support either 32-bit or 64-bit integers, it is assumed that the superlumt library is installed using the same integer precision as the sundials sunindextype option
"

cc. @ViralBShah

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it makes sense to make these binaries 32-bit only to avoid the 32/64-bit mess. @ChrisRackauckas fyi.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're doing 32-bit everywhere, and so the -D_LONGINT should go away?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's do that.

@christopher-dG
Copy link
Contributor Author

I think this should be good to go now? Seems like the next bit of experimentation is to do with Sundials.

@jd-lara
Copy link
Contributor

jd-lara commented May 5, 2020

I think this should be good to go now? Seems like the next bit of experimentation is to do with Sundials.

Yeah we need to make Sundials use OpenBlas32 and force the indexes to 32bit. I can try to do that over the weekend.

@christopher-dG
Copy link
Contributor Author

I had a quick go at it last night, all I did was change the OpenBLAS_jll dependency, change the lapack library path to never have the 64_ suffix, and didn't apply the Fortran patch that added the suffixes. It built fine, but I tried to run Sundials tests with it and it errored out complaining about libgfortran.so.3 not being found...

@ViralBShah
Copy link
Member

That probably just needs to expand the libgfortran versions or something, and add a dependency on CompilerSupportLibraries.

@christopher-dG
Copy link
Contributor Author

Looks like you were right, it loads properly after compiling the different libgfortran versions. I do get one test failure though.

@christopher-dG
Copy link
Contributor Author

32-bit sundials discussion can continue in #995

@christopher-dG
Copy link
Contributor Author

@giordano I think this is good to merge now.

@ViralBShah ViralBShah merged commit a3b16a9 into JuliaPackaging:master May 8, 2020
@ViralBShah
Copy link
Member

ViralBShah commented May 8, 2020

We may have to revisit the openmp choice later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants