"Illegal instruction" error when using --clipAdapterType CellRanger4 #1218

Open
VincentGardeux opened this issue Apr 23, 2021 · 16 comments
Labels
issue: code Likely to be an issue with STAR code

Comments

@VincentGardeux

Hello,
I'm using STARsolo for aligning/demultiplexing BRB-seq libraries.
These are bulk RNA-seq libraries with a very similar construct to 10x.

I recently saw the new option --clipAdapterType CellRanger4 which sounds super cool because that's exactly the trimming we were doing.
However, when using this option, after loading the genome, it exits with an "Illegal instruction" error:

Apr 23 17:24:05 ..... started STAR run
Apr 23 17:24:05 ..... loading genome
Apr 23 17:26:19 ..... started mapping
Illegal instruction

If I use all other options but this one, the run is successful.
I tried removing some options, in case there would be some incompatibilities, but it kept failing.
Here is the simplest command I've run

STAR \
--runMode alignReads \
--runThreadN 2 \
--soloStrand Forward \
--genomeDir ${starindex} \
--soloType CB_UMI_Simple \
--soloCBstart 1 \
--soloCBlen 12 \
--soloUMIstart 13 \
--soloUMIlen 9 \
--soloCellFilter None \
--soloCBwhitelist ${barcodefile} \
--clipAdapterType CellRanger4 \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate \
--outFileNamePrefix bam/ \
--readFilesIn fastq/toto_R2.fastq.gz fastq/toto_R1.fastq.gz

Thanks in advance!

@alexdobin
Owner

Hi Vincent,

This problem comes from --clipAdapterType CellRanger4, which uses SIMD instructions.
What's the hardware you are using?
You would need to compile STAR on your machine to use this option, by running make from the source/ directory.
If this does not work for the 2.7.8a release, please download the most recent patch from GitHub master: https://github.com/alexdobin/STAR/archive/refs/heads/master.zip
and try to compile it.

Cheers
Alex

@alexdobin alexdobin added the issue: code Likely to be an issue with STAR code label Apr 27, 2021
@VincentGardeux
Author

Hey Alex,

Sorry for the delayed answer.
I compiled my 2.7.8a release, and indeed it seems to work now.

Thanks for the tip!

Cheers

@alexdobin
Owner

alexdobin commented May 7, 2021

Hi Vincent,

I would recommend switching to 2.7.9a and compiling. It includes the SIMDe software that adjusts the code for a given architecture. When compiling, you would need to specify your SIMD architecture with:
make CXXFLAGS_SIMD=<arch>

Cheers
Alex

@VincentGardeux
Author

VincentGardeux commented May 11, 2021

Hi Alex,

I followed your recommendation and compiled the latest 2.7.9a with: make CXXFLAGS_SIMD=x86_64
But I guess the processor arch is not the architecture you are talking about, coz it stops with "No such file or directory" error.

I have absolutely no clue what is the SIMD architecture of my server, how can I know that?
At some point, I was hoping that the compilation would detect and adapt automatically to my arch (since it was working directly without specifying anything on the previous version)

I also tried the same command as in the README markdown file:

make STAR CXXFLAGS_SIMD=sse

But I get the same error message:

g++ -c -I./ -std=c++11 -I/usr/include -O3 -std=c++11 -fopenmp -D'COMPILATION_TIME_PLACE="2021-05-11T12:21:30+0200 fameux:/data/software/STAR-2.7.9a/source"' -pipe -Wall -Wextra  sse opal.cpp
g++: error: sse: No such file or directory
make: *** [opal/opal.o] Error 1

So I'll stick with v.2.7.8a for now I guess...

Thanks

@alexdobin
Owner

Hi Vincent,

sorry, there was a mistake in the README file, it should be:
make STAR CXXFLAGS_SIMD="-msse4.2"

Finding which SIMD extensions your processor supports is not as easy as it should be.
You can look for sse* and avx* in
cat /proc/cpuinfo | grep flags | head -n1
Or you can find the exact model of your processor and look it up on the AMD or Intel websites.

Or you can simply use
make STAR CXXFLAGS_SIMD="-march=native"
and the compiler will figure it out. It may not be the best (fastest) option, but it should be the safest.

Cheers
Alex
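
The lookup described above can be sketched as a small shell helper. This is only an illustration: `pick_simd_flag` is a hypothetical name, not part of STAR, and `-mavx2` is a standard GCC option that this thread does not mention, so treat its use with `CXXFLAGS_SIMD` as an assumption. The helper takes the `flags` line from /proc/cpuinfo (Linux only) and falls back to `-march=native` when neither extension is listed.

```shell
# Map a CPU "flags" line (as printed in /proc/cpuinfo) to a CXXFLAGS_SIMD value.
# pick_simd_flag is a hypothetical helper, not part of STAR.
pick_simd_flag() {
    # Pad with spaces so each flag can be matched as a whole word.
    flags=" $1 "
    case "$flags" in
        *" avx2 "*)   echo "-mavx2" ;;        # assumption: plain GCC flag
        *" sse4_2 "*) echo "-msse4.2" ;;      # flag given earlier in this thread
        *)            echo "-march=native" ;; # let the compiler decide
    esac
}

# Linux only: take the first "flags" line and drop the "flags :" prefix.
cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2 || true)

# Print the suggested build command (run it from STAR's source/ directory).
echo "make STAR CXXFLAGS_SIMD=\"$(pick_simd_flag "$cpu_flags")\""
```

The command is only printed, not executed, so the suggestion can be reviewed before building.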

@VincentGardeux
Author

Hi Alex,

Thanks for the info. I've found the sse4_2 flag when running the cpuinfo command, so I've run the compilation with the -msse4.2 flag
This time the compilation went through with only warnings.

I tried running STAR alignReads with the --clipAdapterType CellRanger4 option and it also went through, without error.

I've also done a quick "safety check" of the output bam files, and they are perfectly identical to the ones I've got with the same command run on v.2.7.8a

So it seems all good to me from my side!
Thanks for the help!

Cheers
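
A check like the "safety check" described above could be done by comparing only the alignment records, since BAM headers may record the STAR version and command line and can therefore differ between builds. The sketch below assumes `samtools` and `md5sum` are installed. `records_digest` is a hypothetical helper and the file paths are placeholders.

```shell
# Digest of whatever arrives on stdin; records_digest is a hypothetical helper.
records_digest() {
    md5sum | cut -d' ' -f1
}

# Usage, assuming samtools is available and the paths are placeholders:
#   old=$(samtools view old/Aligned.sortedByCoord.out.bam | records_digest)
#   new=$(samtools view new/Aligned.sortedByCoord.out.bam | records_digest)
#   [ "$old" = "$new" ] && echo "alignment records identical"
```

`samtools view` without `-h` prints the records without the header, which is why the digests remain comparable across STAR versions.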

@piyushjo15

Hi Alex,

I am running STAR on an HPC cluster and compiled it with your command make STAR CXXFLAGS_SIMD="-march=native". The build worked and I was able to run STAR, but --clipAdapterType CellRanger4 still didn't work for me.

Thanks,
Piyush

@hermidalc

hermidalc commented Sep 29, 2022

Hi @alexdobin - Will this issue get fixed so that users don't need to recompile STAR to use --clipAdapterType CellRanger4? A lot of users use STAR as part of larger snakemake workflows for example, which rely on wrappers and automatic dependency install via conda/mamba.

@alexdobin
Owner

Hi Leandro,
I am not sure what the issue is.
For some architectures, STAR needs to be compiled with different flags.
I thought that automatic installers are supposed to pull the proper executables.
I do not have the expertise/bandwidth to implement automatic detection of the architecture.

@hermidalc

hermidalc commented Oct 4, 2022

> Hi Leandro,
> I am not sure what the issue is.
> For some architectures, STAR needs to be compiled with different flags.
> I thought that automatic installers are supposed to pull the proper executables.
> I do not have the expertise/bandwidth to implement automatic detection of the architecture.

Thanks for the response, but I think bioconda (and usually conda in general) only has different STAR packages per platform (Linux x64, Mac OSX, etc.), not per supported CPU extension.

I'll have to see if it's even possible to make a pull request for the bioconda STAR packaging repo so that it compiles and makes different packages depending on CPU extensions.

@pettyalex

> > Hi Leandro,
> > I am not sure what the issue is.
> > For some architectures, STAR needs to be compiled with different flags.
> > I thought that automatic installers are supposed to pull the proper executables.
> > I do not have the expertise/bandwidth to implement automatic detection of the architecture.
>
> Thanks for the response, but I think bioconda (and usually conda in general) only has different STAR packages per platform (Linux x64, Mac OSX, etc.), not per supported CPU extension.
>
> I'll have to see if it's even possible to make a pull request for the bioconda STAR packaging repo so that it compiles and makes different packages depending on CPU extensions.

It sounds like this is specifically a packaging problem. If the goal for Bioconda is to maximize compatibility, they should do what Debian Med does for distributing STAR and target SSE2: https://salsa.debian.org/med-team/rna-star/-/blob/master/debian/patches/do-not-enforce-avx2.patch

If you want to run a binary specifically suited to different CPU families, the conda packagers could do an approach like https://github.com/bwa-mem2/bwa-mem2 where you build a star binary for each instruction set: SSE2, AVX2, AVX512 and then have an entry script to find and run the right one. This is all a conda packaging problem, not a problem with STAR though.
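
A rough sketch of such an entry script is shown below. It is not how bwa-mem2 or any existing STAR package works. The binary names `STAR-avx512`, `STAR-avx2` and `STAR-sse2` and the helper `select_star_binary` are hypothetical, and the detection relies on the Linux /proc/cpuinfo `flags` line (where AVX-512 Foundation appears as `avx512f`).

```shell
# Choose a per-instruction-set STAR build from a CPU "flags" line.
# The binary names STAR-avx512, STAR-avx2 and STAR-sse2 are hypothetical.
select_star_binary() {
    # Pad with spaces so each flag can be matched as a whole word.
    flags=" $1 "
    case "$flags" in
        *" avx512f "*) echo "STAR-avx512" ;;
        *" avx2 "*)    echo "STAR-avx2" ;;
        *)             echo "STAR-sse2" ;;  # most compatible fallback
    esac
}

# Entry-point usage (Linux), e.g. in a wrapper script installed as "STAR":
#   cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   exec "$(dirname "$0")/$(select_star_binary "$cpu_flags")" "$@"
```

Falling through to the SSE2 build by default keeps the wrapper usable on CPUs whose flags cannot be read, at the cost of speed.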

@Lil-Psilocybe

Hello! Reviving this thread since I am having this issue on an HPC run by a bioinformatics core at my institution, and they tried updating to the most recent version of STAR. Where exactly do we run the commands
make STAR CXXFLAGS_SIMD="-msse4.2" / make STAR CXXFLAGS_SIMD="-march=native"
cat /proc/cpuinfo | grep flags | head -n1

Is this just on the command line when I log in to the HPC, or do these need to go in my STAR command?

@pettyalex

@Lil-Psilocybe How did you install STAR, and what is the oldest CPU architecture that you expect to run STAR on?

Those are compile-time variables that must be set when STAR is compiled from source code; they are not options of the STAR command itself. If you installed STAR from bioconda, that version is not compatible with older CPUs and will not run on them.

@Lil-Psilocybe

The installs are done by the bioinformatics core and they would have the arch info; I don't use a conda environment here either. I'll make them aware of this thread. I'll be in touch with this info, thanks!

@pettyalex

You may also try scheduling it to run only on newer nodes, either by targeting a specific set of nodes or by requiring a particular CPU feature in your scheduler. AVX2 has been present on all Intel Xeon processors since 2014 and all AMD Epyc processors since 2017, so only servers that are more than 9 years old should encounter this problem.

If your nodes are virtualized, I would speculate that some hypervisor configurations could break this as well. I've seen hypervisors not pass all supported CPU flags through to VMs, but I'd expect the binary to still run because the support is actually there.

@Lil-Psilocybe

We got it to run! Just recompiling and then running on Xeon processors got it to work. Thanks for your detailed response though! I'll be back in case anything else comes up.


6 participants