ash ${str/find/repl} performance

Ron Yorston rmy at pobox.com
Wed Jul 21 12:52:38 UTC 2021


I wrote:
>It would be possible to use a more efficient approach when the pattern
>is known to have no special characters.  But that would require more
>code.

To quantify this I made a patch which handles the special case of
patterns that don't require globbing.  The feature is controlled by
the ASH_OPTIMIZE_FOR_SIZE configuration option so it isn't enabled
in the default build (where ASH_OPTIMIZE_FOR_SIZE is enabled).
When it is enabled it adds 183 bytes.

Benchmarks with the feature enabled follow.  This command uses a
bracket expression so globbing is required:

hyperfine -L n 10,15,20,30 -S ./ash -w 2 -r 10 -s basic '
    x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -{n});
    for i in $(seq 1 20); do echo "${x//[:]/|}"; done'

Benchmark #1: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -10);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done
  Time (mean ± σ):      36.9 ms ±   0.6 ms    [User: 36.3 ms, System: 1.1 ms]
  Range (min … max):    36.2 ms …  38.4 ms    10 runs
 
Benchmark #2: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -15);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done
  Time (mean ± σ):     104.6 ms ±   1.4 ms    [User: 104.0 ms, System: 1.0 ms]
  Range (min … max):   103.1 ms … 107.6 ms    10 runs
 
Benchmark #3: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -20);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done
  Time (mean ± σ):     241.3 ms ±   2.7 ms    [User: 240.5 ms, System: 1.1 ms]
  Range (min … max):   238.2 ms … 246.7 ms    10 runs
 
Benchmark #4: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -30);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done
  Time (mean ± σ):     697.1 ms ±   4.0 ms    [User: 695.6 ms, System: 1.0 ms]
  Range (min … max):   691.8 ms … 701.8 ms    10 runs

Summary
  '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -10);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done' ran
    2.84 ± 0.06 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -15);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done'
    6.54 ± 0.14 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -20);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done'
   18.90 ± 0.35 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -30);
        for i in $(seq 1 20); do echo "${x//[:]/|}"; done'

The elapsed times and the increase in time with variable length are
similar to the results reported previously.

This command uses a literal search string:

hyperfine -L n 10,15,20,30 -S ./ash -w 2 -r 10 -s basic '
    x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -{n});
    for i in $(seq 1 1000); do echo "${x//:/|}"; done'

Benchmark #1: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -10);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done
  Time (mean ± σ):       9.0 ms ±   0.6 ms    [User: 8.4 ms, System: 1.2 ms]
  Range (min … max):     8.5 ms …  10.5 ms    10 runs
 
Benchmark #2: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -15);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done
  Time (mean ± σ):      12.5 ms ±   0.1 ms    [User: 11.6 ms, System: 1.5 ms]
  Range (min … max):    12.4 ms …  12.7 ms    10 runs
 
Benchmark #3: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -20);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done
  Time (mean ± σ):      17.9 ms ±   1.1 ms    [User: 17.6 ms, System: 0.9 ms]
  Range (min … max):    17.0 ms …  19.5 ms    10 runs
 
Benchmark #4: 
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -30);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done
  Time (mean ± σ):      27.2 ms ±   0.9 ms    [User: 26.7 ms, System: 1.1 ms]
  Range (min … max):    26.0 ms …  28.4 ms    10 runs

Summary
  '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -10);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done' ran
    1.39 ± 0.09 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -15);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done'
    1.99 ± 0.18 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -20);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done'
    3.03 ± 0.22 times faster than '
        x=$(cat /etc/passwd /etc/passwd /etc/passwd | head -30);
        for i in $(seq 1 1000); do echo "${x//:/|}"; done'

The elapsed times are much shorter:  I had to increase the number of
iterations from 20 to 1000 to get reliable measurements.  The increase
in time with variable length is linear.

Ron


More information about the busybox mailing list