add patch for GCCcore 11.1.0 + 11.2.0 to fix AVX2 bug #17135

Merged


Flamefire
Contributor

@Flamefire Flamefire commented Jan 17, 2023

(created using eb --new-pr)

This fixes a serious bug in GCC that affects code using AVX-2 intrinsics, as XNNPACK does (a dependency of PyTorch and TensorFlow, used for quantization). An affected intrinsic yields wrong results because the order of operands in the AVX vector is wrong. In addition, aliasing issues make further bugs possible through "undefined behavior", i.e. "everything" is possible.

See e.g. pytorch/pytorch#92246 for an actual bug caused by this and https://stackoverflow.com/a/72837992/1930508 for the post that led me to the solution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754
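
To make the "wrong order" failure mode concrete, here is a minimal pure-Python sketch. It is a toy model, not the GCC source and not the content of the patch. It assumes the affected intrinsic is `_mm_loadu_si32`, which is my reading of the linked GCC report, and that the fault amounts to passing the arguments of `_mm_set_epi32` in reverse. The function names below are hypothetical stand-ins that model a 128-bit register as a list of four 32-bit lanes, with index 0 as the lowest lane.

```python
# Toy model of a 128-bit vector register as four 32-bit lanes.
# Index 0 is the lowest lane. All names here are illustrative stand-ins.

def set_epi32(e3, e2, e1, e0):
    """Mimic the argument order of _mm_set_epi32: the LAST argument is lane 0."""
    return [e0, e1, e2, e3]


def loadu_si32_expected(value):
    """Documented behaviour: the loaded value sits in lane 0, upper lanes are zero."""
    return set_epi32(0, 0, 0, value)


def loadu_si32_swapped(value):
    """Assumed faulty variant: reversed arguments put the value in lane 3."""
    return set_epi32(value, 0, 0, 0)


def cvtsi128_si32(vec):
    """Mimic _mm_cvtsi128_si32: read back the lowest lane."""
    return vec[0]


if __name__ == "__main__":
    print(loadu_si32_expected(42))                 # value in the lowest lane
    print(loadu_si32_swapped(42))                  # value in the highest lane
    print(cvtsi128_si32(loadu_si32_swapped(42)))   # reads a zero lane instead of 42
```

Under these assumptions, any code that reads the low lane after such a load would see zero instead of the loaded value, which would be consistent with silently wrong numerical results in quantized kernels.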

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusi8029 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/59135df94c81dda7e568cef480c74405 for a full test report.

@boegel boegel changed the title from "Fix AVX2 bug in GCC 11" to "add patch for GCCcore 11.1.0 + 11.2.0 to fix AVX2 bug" Jan 17, 2023
@boegel boegel added the bug fix label Jan 17, 2023
@boegel boegel added this to the next release (4.7.1?) milestone Jan 17, 2023
@boegel
Member

boegel commented Jan 17, 2023

@boegelbot please test @ generoso
CORE_CNT=16
EB_ARGS="--installpath /tmp/$USER/pr17135"

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on login1

PR test command 'EB_PR=17135 EB_ARGS="--installpath /tmp/$USER/pr17135" EB_CONTAINER= /opt/software/slurm/bin/sbatch --job-name test_PR_17135 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10009

Test results coming soon (I hope)...

- notification for comment with ID 1385929660 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
cnx2 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/d429a0c1ef2c655439bfabd0741acd16 for a full test report.

@boegel
Member

boegel commented Jan 17, 2023

@boegelbot please test @ jsc-zen2
CORE_CNT=16
EB_ARGS="--installpath /tmp/$USER/pr17135"

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on jsczen2l1.int.jsc-zen2.easybuild-test.cluster

PR test command 'EB_PR=17135 EB_ARGS="--installpath /tmp/$USER/pr17135" /opt/software/slurm/bin/sbatch --mem-per-cpu=4000M --job-name test_PR_17135 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen2.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 2064

Test results coming soon (I hope)...

- notification for comment with ID 1386033358 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen2c1.int.jsc-zen2.easybuild-test.cluster - Linux Rocky Linux 8.5, x86_64, AMD EPYC 7742 64-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/2c944a092fd8dc026e87a582b97dfafa for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusa12 - Linux CentOS Linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz (broadwell), 3 x NVIDIA GeForce GTX 1080 Ti, 460.32.03, Python 2.7.5
See https://gist.github.com/81b3e929f703166a945134aafcd92026 for a full test report.

@boegel
Member

boegel commented Jan 18, 2023

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node3102.skitty.os - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz (skylake_avx512), Python 3.6.8
See https://gist.github.com/1ebf34a4fd18ccbfaa198093c5af51ba for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Jan 18, 2023

Going in, thanks @Flamefire!

@boegel boegel merged commit 8222a69 into easybuilders:develop Jan 18, 2023
@boegel
Member

boegel commented Jan 18, 2023

@boegelbot please test @ generoso
CORE_CNT=16

@Flamefire Flamefire deleted the 20230117170900_new_pr_GCCcore1110 branch January 18, 2023 11:59
@boegelbot
Collaborator

@boegel: Request for testing this PR well received on login1

PR test command 'EB_PR=17135 EB_ARGS= EB_CONTAINER= /opt/software/slurm/bin/sbatch --job-name test_PR_17135 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10022

Test results coming soon (I hope)...

- notification for comment with ID 1386934278 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@verdurin
Member

Test report by @verdurin
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
easybuild-c7.novalocal - Linux CentOS Linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/22e6c63afbe9373ad7e22024d74edd0f for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
cnx1 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/422199c50550bfc3dd5256e4538d33a2 for a full test report.

@boegel
Member

boegel commented Jan 18, 2023

@boegelbot please test @ jsc-zen2
CORE_CNT=16

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on jsczen2l1.int.jsc-zen2.easybuild-test.cluster

PR test command 'EB_PR=17135 EB_ARGS= /opt/software/slurm/bin/sbatch --mem-per-cpu=4000M --job-name test_PR_17135 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen2.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 2071

Test results coming soon (I hope)...

- notification for comment with ID 1387138137 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusml27 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/58329cafaeec06d03459706e79e50c5c for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen2c1.int.jsc-zen2.easybuild-test.cluster - Linux Rocky Linux 8.5, x86_64, AMD EPYC 7742 64-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/9637d697cb401fa49b08a80ceb8fbb5e for a full test report.
