BLIS should allow simultaneously exporting both 32- and 64-bit variants of BLAS/CBLAS #43

njsmith · 2016-03-01T07:07:05Z

The de facto standard is that the standard BLAS/CBLAS functions take 32-bit integers in their API. Julia experimented with changing this so that they could use 64-bit integers in their main BLAS wrappers, and this worked great for a little while until they discovered that when people started trying to link in other existing BLAS-using code, this code was assuming that BLAS uses 32-bit integers and was causing segfaults. Their solution was to continue to use a 64-bit integer version of BLAS, but with symbols renamed to avoid collisions (so e.g. dgemm_ uses 32-bit integers, and dgemm_64_ uses 64-bit integers... [edited to get the 64-bit symbol names correct])

As mentioned in #37 (comment), it would be great if a single BLIS library could export both 32- and 64-bit versions of these symbols simultaneously. It doesn't look like this would be too hard, since the BLAS2BLIS interface is already generated using C preprocessor magic, and the CBLAS wrapper is already getting programmatically patched...
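A minimal sketch of the idea, assuming a toy `dscal`-style routine rather than BLIS's real BLAS2BLIS macros: one preprocessor template is instantiated twice, once per integer width, so both symbols live in the same library. The macro name `DEFINE_DSCAL` is hypothetical; only the `_64_` suffix convention comes from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical template (not BLIS code): expands to one Fortran-style
   entry point. NAME is the exported symbol, INT_T its API integer type. */
#define DEFINE_DSCAL(NAME, INT_T)                                   \
    void NAME(const INT_T *n, const double *alpha,                  \
              double *x, const INT_T *incx)                         \
    {                                                               \
        /* Widen once at the boundary; the loop itself is 64-bit. */ \
        int64_t len = (int64_t)*n, inc = (int64_t)*incx;            \
        if (len <= 0 || inc <= 0) return;                           \
        for (int64_t i = 0; i < len; ++i)                           \
            x[i * inc] *= *alpha;                                   \
    }

DEFINE_DSCAL(dscal_,    int32_t)  /* 32-bit integer API, plain name */
DEFINE_DSCAL(dscal_64_, int64_t)  /* 64-bit integer API, suffixed name */
```

Because the two entry points have different names, code compiled against either integer width can be linked into the same process without colliding.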

tkelman · 2016-03-01T07:22:52Z

This has also been an issue in Matlab, which uses the ILP64 MKL, if you try to link a mex file against the system LP64 BLAS, and that was the case for many years before Julia ever existed. ILP64 BLAS only ever worked in Julia if you didn't try to load any other binary applications compiled against system LP64 BLAS. It was fragile until we implemented the renaming. Lots of context in JuliaLang/julia#4923, OpenMathLib/OpenBLAS#646, OpenMathLib/OpenBLAS#459, and JuliaLang/julia#8734.

We just append 64_ to the C-style symbol names, so it's dgemm_64_, or if you try to call it from Fortran (edit: assuming gfortran mangling) it would be dgemm_64. SunPerf actually had separate API names as long as 15 years ago, ref http://www.netlib.org/atlas/atlas-comm/msg00233.html - so we went with the suffix that would let us directly use some existing code in Tim Davis' SuiteSparse that gets activated by -DSUN64.

jeffhammond · 2016-03-01T07:23:06Z

Is it possible to support dynamic dispatch, wherein both dgemm32 and dgemm64 are in the library, but at runtime dgemm is mapped to dgemm32, unless the user does e.g. export BLIS_64B_MAGIC? That would be cool.

jeffhammond · 2016-03-01T07:25:04Z

@tkelman Why doesn't Julia just call CBLAS? I thought support for that was pretty broad, at least in contexts where Julia is used.

tkelman · 2016-03-01T07:26:33Z

We do in a handful of places where the Fortran calling convention differs between gfortran and MKL/Accelerate. I don't think cblas typically supports 64 bit indices - I could be wrong though.

njsmith · 2016-03-01T07:29:45Z

@jeffhammond: That would indeed be cool, but unfortunately I don't think there's any way to make that happen given the vagaries of different platforms' linking models...

jeffhammond · 2016-03-01T15:27:00Z

@tkelman Does CBLAS need to support 64b integers to meet Julia's needs? For BLAS1 operations, it is trivial to chunk if there are more than 2B elements in a vector. Given the memory constraints, if one has a matrix dimension larger than 2B, the BLAS2/3 operations are going to perform like BLAS1 operations and thus chunking shouldn't be a big issue.
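For illustration, the BLAS1 chunking described above might look like the following sketch. `ddot32` is a stand-in for a 32-bit-integer kernel and `ddot_chunked` is a hypothetical wrapper; neither is a real BLAS or BLIS symbol, and unit strides are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a BLAS1 kernel with a 32-bit integer API (hypothetical). */
static double ddot32(int32_t n, const double *x, const double *y)
{
    double s = 0.0;
    for (int32_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* 64-bit front end: walk the vectors in pieces of at most max_chunk
   elements so that each call fits the 32-bit interface. */
static double ddot_chunked(int64_t n, const double *x, const double *y,
                           int32_t max_chunk)
{
    assert(max_chunk > 0);
    double s = 0.0;
    for (int64_t off = 0; off < n; off += max_chunk) {
        int64_t left = n - off;
        int32_t len = left < max_chunk ? (int32_t)left : max_chunk;
        s += ddot32(len, x + off, y + off);
    }
    return s;
}
```

In practice `max_chunk` would be `INT32_MAX`; it is a parameter here only so the splitting can be exercised on small inputs.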

@njsmith How many different conventions need to be supported for the feature to have high value to the user community?

tkelman · 2016-03-01T15:49:08Z

It was easier for us to rename the symbols than add chunking around every single blas and lapack call. When openblas is built with ILP64 indices, its cblas also gets built with 64 bit integers afaict, but that's a non default setting and will cause segfaults if you're not careful about symbol names and loading other blas libraries in the same process.

jeffhammond · 2016-03-01T15:53:46Z

@tkelman Eww, that's terrible of OpenBLAS. They should always use C int. Someone should file that bug.

tkelman · 2016-03-01T15:57:11Z

They should also always use 32 bit ints for the fortran interface by that reasoning. The problems in practice are identical.

njsmith · 2016-03-01T22:30:42Z

> How many different conventions need to be supported for the feature to have high value to the user community?

I think that the use cases that matter in practice are:

- For systems with 32-bit address spaces: BLIS/BLAS/CBLAS with 32-bit integers.
- For systems with 64-bit address spaces: BLIS internally using 64-bit integers, with one of the following options for the BLAS/CBLAS layer:
  - dgemm_ and cblas_dgemm exist and use 32-bit integers (what Debian and others want)
  - dgemm_64_ and cblas_dgemm_64 exist and use 64-bit integers (what Julia wants, also useful for Jeff's modern Fortran codes)
  - dgemm_ exists and uses 64-bit integers (for bespoke setups that want to work with legacy Fortran code that calls plain dgemm and is compiled with 64-bit INTEGER).

For the last case I'm not sure if anyone ever needs cblas_dgemm to be 64-bit, but I suppose it might be useful sometimes, and MKL seems to provide this as an interface ("All Intel MKL function domains support ILP64 programming but FFTW interfaces", so I guess that includes CBLAS?), so it could be useful for people porting code from MKL, and it's not hard to support.

If we combine the two compatible 64-bit options, this gives us a total of 3 configurations that are important to test/support:

- 32-bit BLIS
- 64-bit BLIS with 32-bit dgemm_/cblas_dgemm and 64-bit dgemm_64_/cblas_dgemm_64
- 64-bit BLIS with 64-bit dgemm_/cblas_dgemm

In principle BLIS certainly could retain the flexibility to support other configurations -- like 32-bit integers internally on 64-bit systems, or 64-bit BLIS + 32-bit dgemm_ and no 64-bit dgemm_, etc. etc., for all the combinatoric possibilities -- but trying to test and support all these seems like a waste of effort to me, since almost all of them are irrelevant in practice.

njsmith · 2016-03-01T23:11:36Z

@jeffhammond: The above list does assume that for your "modern fortran codes" use case, you don't actually care about dgemm_32 and really just want dgemm_64. Is that correct?

tkelman · 2016-03-01T23:12:17Z

I completely agree with @njsmith's summary there.

There's the question of what to do on CBLAS as well. In the OpenBLAS scenario as used by Julia, we have a handful of CBLAS symbols that we use (just cblas_[cz]dot[uc]_sub), but we use them with 64 bit integers and renamed with a 64_ suffix accordingly. cblas_cdotu_sub64_ looks a little funny but the suffix is handled by a macro anyway. I believe people are successfully building Julia against ILP64 MKL and also using those same cblas symbols but without a suffix, with a BlasInt type of 64 bits and it's been working fine. I'd have to check whether that's just luck, or if MKL also changes CBLAS integer sizes when you link (or set the environment variable for the dynamic runtime) in ILP64 mode.

njsmith · 2016-03-01T23:21:46Z

Edited my comment to speak more explicitly about CBLAS -- specifically, I think cblas_dgemm_64 and friends should obviously be supported for all the same reasons we want dgemm_64_, and a look at the MKL docs suggests that their ILP64 builds do provide 64-bit cblas_dgemm and friends.

njsmith · 2016-03-01T23:22:26Z

Oh, ugh, except I missed that Julia's 64-bit cblas functions are named like cblas_dgemm64_ instead of cblas_dgemm_64, which is obviously wrong. @tkelman, does fixing this seem at all likely? :-)

tkelman · 2016-03-01T23:32:52Z

It's not "wrong" per se, just a consequence of how we implemented it. We apply a suffix uniformly to all symbols in the library, after the gfortran mangling rather than before. Looks funny, but just easier to deal with.

tkelman · 2016-03-01T23:34:02Z

Jeez, sorry about the spam, my phone is stupid

njsmith · 2016-03-02T00:38:29Z

@tkelman: I guess by "wrong" I mean "if we were writing a standard from scratch this is obviously not what we would do". Since now we are talking about making a standard and BLIS implementing it, there's a question about whether we should follow the dorky existing thing, or fix the existing thing and then implement that :-). I guess you're voting for following the existing thing?

tkelman · 2016-03-02T01:45:15Z

"add a suffix, but before the trailing underscore for symbols that are pretending to be from fortran" (or "add 64_ for f-blas symbols and _64 for c-blas symbols") is a more fiddly rule to implement than "add the same suffix to all symbols." ILP64 is generally a situation where you should know what you are doing, or leave the linking and renaming to someone who does, so I don't care too much what the result looks like. It's almost always going to be handled by preprocessor macros anyway.
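To make the two rules concrete, here is a small sketch using token pasting. The macro names are hypothetical and are not taken from Julia, OpenBLAS, or BLIS; only the resulting symbol spellings come from this thread.

```c
#include <assert.h>
#include <string.h>

/* Helpers: two-level macros so arguments are expanded before # and ##. */
#define STR_(x) #x
#define STR(x)  STR_(x)
#define CAT_(a, b) a##b
#define CAT(a, b)  CAT_(a, b)

/* Rule A: append the same suffix to every exported symbol. */
#define SUFFIX64_UNIFORM(sym) CAT(sym, 64_)

/* Rule B: 64_ for Fortran-style symbols, _64 for CBLAS symbols. */
#define F_SYM64(sym) CAT(sym, 64_)
#define C_SYM64(sym) CAT(sym, _64)

/* Spellings each rule produces, as strings for inspection. */
static const char *const uniform_f = STR(SUFFIX64_UNIFORM(dgemm_));
static const char *const uniform_c = STR(SUFFIX64_UNIFORM(cblas_dgemm));
static const char *const split_f   = STR(F_SYM64(dgemm_));
static const char *const split_c   = STR(C_SYM64(cblas_dgemm));
```

Under rule A the CBLAS name comes out as `cblas_dgemm64_`, the "funny-looking" form discussed above; rule B yields `cblas_dgemm_64` at the cost of a second macro and of knowing which family each symbol belongs to.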

devinamatthews · 2020-08-06T20:33:41Z

I'm not seeing much consensus here. If someone wants to "create a standard" then we can try to implement it.

devinamatthews mentioned this issue Nov 2, 2016

Match default integer sizes to standard interfaces #42

Closed

devinamatthews closed this as completed Aug 6, 2020

mkrainiuk mentioned this issue May 5, 2022

CBLAS/LAPACKE extension for 64bit integer Reference-LAPACK/lapack#666

Closed
