memory latency SIZE overflow #2

Open
mpetri opened this issue Sep 23, 2013 · 2 comments

mpetri commented Sep 23, 2013

I have a system with lots of RAM (1.5TB) and I want to use this tool to test the memory latency when accessing different NUMA nodes. Unfortunately, the block SIZE variable overflows because it is stored in an integer instead of a 64-bit value.
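To illustrate the overflow described above, here is a minimal sketch in C. It assumes a platform where `int` is 32 bits wide, and it treats 1.5TB as 1536 GiB. The helper names `fits_in_int` and `truncate_to_32_bits` are hypothetical and are not taken from the tinymembench source.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper, not part of tinymembench: returns 1 when a byte
 * count can be represented by a plain int, 0 otherwise. */
static int fits_in_int(uint64_t bytes)
{
    return bytes <= (uint64_t)INT_MAX;
}

/* Hypothetical helper: the value that survives when a byte count is
 * squeezed into 32 bits. Unsigned truncation is used here so that the
 * result is well defined in C (signed overflow would not be). */
static uint32_t truncate_to_32_bits(uint64_t bytes)
{
    return (uint32_t)(bytes & UINT32_MAX);
}
```

With these helpers, `fits_in_int(UINT64_C(1536) << 30)` is 0, and `truncate_to_32_bits(UINT64_C(1536) << 30)` is also 0, because 1536 GiB is a multiple of 2^32. Holding the size in a `uint64_t` or `size_t` avoids the problem on 64-bit systems.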

ssvb (Owner) commented Sep 24, 2013

That's interesting. This tool was developed primarily for embedded ARM/MIPS systems, to spot oddities and misconfiguration in the memory subsystem. I never imagined that anyone would run it on big servers :)

About the SIZE variable limitation: there is one more problem to solve. Currently a 32-bit linear congruential generator (http://en.wikipedia.org/wiki/Linear_congruential_generator) is used for generating random offsets inside the buffer. This also needs to be reworked a bit. I'll try to see what can be done.

mpetri (Author) commented Sep 24, 2013

Looking at the Wikipedia article on the linear congruential generator, it looks like the current constants (1103515245 and 12345) can just be replaced by 6364136223846793005 and 1442695040888963407 respectively.

Thanks for the awesome tool, by the way. In large server systems memory access is not uniform, and I want to use your tool to measure the latency between different nodes in the system.
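The constant swap suggested above can be sketched as follows. This is an illustration only, not the tinymembench implementation: the names `lcg64_next` and `lcg64_offset` are hypothetical, the state is assumed to be widened to `uint64_t` along with the constants, and the buffer size is assumed to be a power of two. The two constants are the ones quoted in the comment; the Wikipedia table lists them for Knuth's MMIX generator with modulus 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Constants quoted in the comment above (modulus 2^64). */
#define LCG64_MULTIPLIER UINT64_C(6364136223846793005)
#define LCG64_INCREMENT  UINT64_C(1442695040888963407)

/* Hypothetical helper: advance the generator by one step. Unsigned
 * 64-bit arithmetic wraps modulo 2^64, which is the modulus we want. */
static uint64_t lcg64_next(uint64_t *state)
{
    *state = *state * LCG64_MULTIPLIER + LCG64_INCREMENT;
    return *state;
}

/* Hypothetical helper: random offset in [0, 2^log2_size). The high bits
 * are used because the low bits of a power-of-two-modulus LCG have
 * short periods. Assumes 1 <= log2_size <= 63. */
static uint64_t lcg64_offset(uint64_t *state, unsigned log2_size)
{
    assert(log2_size >= 1 && log2_size <= 63);
    return lcg64_next(state) >> (64 - log2_size);
}
```

For a 2 TiB buffer, `lcg64_offset(&state, 41)` yields offsets below 2^41, which a 32-bit generator cannot reach. A real benchmark would likely also need to align the offset to the access size.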

fsgeek pushed a commit to fsgeek/tinymembench that referenced this issue Feb 26, 2021
* copy SSE2 assembly to avx2

no code changes yet

* remove non-AVX archs

remove i386 and win32

* update aligned_block_copy

update aligned_block_copy to AVX2

* convert aligned_block_copy_nt to AVX2

* convert aligned_block_copy_pf32 to AVX2

* block_copy_pf64 and block_copy_nt_pf32

* convert aligned_block_copy_nt_pf64

* convert aligned_block_fill

* convert aligned_block_fill_nt

* define __X86_AVX2_H__

* delete movsd and movsb functions

These don't seem to have a place in AVX2 routines

* update Makefile with x86-avx2

* add avx2 benchmarks (incomplete)

Still needs to:
- properly check AVX2 support
- run both SSE2 and AVX2 if AVX2 supported

* avx2 benchmarks include sse2

If AVX2, append the SSE2 benchmarks to the AVX2 benchmarks and run all of them

* simplify array sizing calcs

* separate AVX2 bi from SSE2

* update copyright notice

* define a constant for the register size

* add AVX512 assembly code

* replace vmovdqa with vmovdqa64 (?)

* add AVX512 benchmarks

* use getopt_long to add args to run SSE/AVX tests

Co-authored-by: Joel Luth <joel.luth@emc.com>
Co-authored-by: root <root@mpl037a.west.isilon.com>