Sunday 26 October 2014

even more stress in stress-ng

Over the past few weeks in spare moments I've been adding more stress methods to stress-ng  ready for Ubuntu 15.04 Vivid Vervet.   My intention is to produce a rich set of stress methods that can stress and exercise many facets of a system to force out bugs, catch thermal over-runs and generally torture a kernel in a controlled repeatable manner.

I've also re-structured the tool in several ways to enhance the features and make it easier to maintain.  The cpu stress method has been re-worked to include nearly 40 different ways to stress a processor, covering:
  • Bit manipulation: bitops, crc16, hamming
  • Integer operations: int8, int16, int32, int64, rand
  • Floating point:  long double, double,  float, ln2, hyperbolic, trig
  • Recursion: ackermann, hanoi
  • Computation: correlate, euler, explog, fibonacci, gcd, gray, idct, matrixprod, nsqrt, omega, phi, prime, psi, rgb, sieve, sqrt, zeta
  • Hashing: jenkin, pjw
  • Control flow: jmp, loop
..the intention was to have a wide enough eclectic mix of CPU exercising tests that cover a wide range of typical operations found in computationally intense software.   Use the new --cpu-method option to select the specific CPU stressor, or --cpu-method all to exercise all of them sequentially.

I've also added more generic system stress methods too:
  • bigheap - re-allocs to force OOM killing
  • rename - rename files rapidly
  • utime - update file modification times to create lots of dirty file metadata
  • fstat - rapid fstat'ing of large quantities of files
  • qsort - sorting of large quantities of random data
  • msg - System V message sending/receiving
  • nice - rapid re-nicing processes
  • sigfpe - catch rapid division by zero errors using SIGFPE
  • rdrand - rapid reading of Intel random number generator using the rdrand instruction (Ivybridge and later CPUs only)
Other new options:
  • metrics-brief - this dumps out only the bogo-op metrics that are relevant for just the tests that were run.
  • verify - this will sanity check the stress results per iteration to ensure memory operations and CPU computations are working as expected. Hopefully this will catch any errors on a hot machine that has errors in the hardware. 
  • sequential - this will run all the stress methods one by one (for a default of 60 seconds each) rather than all in parallel.   Use this with the --timeout option to run all the stress methods sequentially each for a specified amount of time. 
  • Specifying 0 instances of any stress method will run an instance of the stress method on all online CPUs. 
The tool also builds and runs on Debian kFreeBSD and GNU HURD kernels although some stress methods or stress options are not included due to lack of support on these other kernels.
The stress-ng man page gives far more explanation of each stress method and more detailed examples of how to use the tool.

For more details, visit here or read the manual.