Simple benchmark for memory throughput and latency
This is a simple memory benchmark that tries to measure the peak bandwidth of sequential memory accesses and the latency of random memory accesses. Bandwidth is measured by running several assembly routines on aligned memory blocks and trying different prefetch strategies.
See the log files for the full output.
More data is available in the wiki.
....
---
standard memcpy : 30251.8 MB/s (22.9%)
standard memset : 42933.1 MB/s (8.2%)
---
....
---
standard memcpy : 7546.7 MB/s (0.6%)
standard memset : 28423.6 MB/s (3.6%)
---
---
standard memcpy : 7093.3 MB/s (0.6%)
standard memset : 10988.2 MB/s (0.5%)
---
....
---
standard memcpy : 8036.5 MB/s (1.0%)
standard memset : 33126.4 MB/s (7.6%)
---
....
---
standard memcpy : 15452.2 MB/s (0.2%)
standard memset : 32821.6 MB/s (0.4%)
---
The experiments aim to characterize memory access speed. I named the CPUs only for indexing's sake; it would be more accurate to describe the memory configuration in each particular case. More runs would also be needed to get more stable data.
Personally tested on an Apple M1 MacBook Air and an i9 MacBook Pro; an Android AArch64 mobile phone and tablet with Cortex-A76 and Cortex-A55 cores; and a Linux x86 i7 desktop machine.
Alekum(Rojaster)