March 29, 2012

the bash way is faster, but only with bash

Some bashisms look like mere syntactic sugar at first sight, such as the += concatenation syntax. They usually turn out to be faster than their more portable counterparts, but only when the script is run by bash itself.

Take the following script as an example:
#!/bin/sh
# script1-portable.sh
part="$(seq 1 100000)"

for i in $(seq 1 10); do
    seq="${part}"
    seq="${seq}${part}"
done

$ time bash script1-portable.sh
user    0m20.837s

Now compare it with the following script, which uses +=:
#!/bin/bash
# script1-bash.sh
part="$(seq 1 100000)"

for i in $(seq 1 10); do
    seq="${part}"
    seq+="${part}"
done

$ time bash script1-bash.sh
user    0m14.227s

Yes, it's faster. However, look at what happens when the first, portable script is run with dash:
$ time dash script1-portable.sh
user    0m0.609s
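
The repetition count is hard-coded in the scripts above. On a faster machine the runs may finish too quickly to compare, so here is a minimal sketch of a variant that takes the count as its first argument. The file name script1-reps.sh and the reps variable are assumptions for illustration, not part of the original scripts:

```sh
#!/bin/sh
# script1-reps.sh (hypothetical name): script1-portable.sh with a
# configurable number of repetitions.
reps="${1:-10}"            # default to the 10 repetitions used above
part="$(seq 1 100000)"

for i in $(seq 1 "$reps"); do
    seq="${part}"
    seq="${seq}${part}"    # portable concatenation, no +=
done
```

It could then be run as, for example, `time dash script1-reps.sh 50`.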

[[ is another example:
#!/bin/sh
# script2-portable.sh
a="$(seq 1 100000)"; b="$(seq 1 100)"

for i in $(seq 1 10); do
    [ "$a" = "$b" ]
done

$ time bash script2-portable.sh
user    0m9.148s

And the version using the bashism:
#!/bin/bash
# script2-bash.sh
a="$(seq 1 100000)"; b="$(seq 1 100)"

for i in $(seq 1 10); do
    [[ $a = $b ]]
done

$ time bash script2-bash.sh
user    0m4.223s

Once again, the bash way is faster, yet it doesn't come close to dash:
$ time dash script2-portable.sh
user    0m0.588s
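
One caveat about script2-bash.sh: inside [[ ]], bash treats an unquoted right-hand side of = as a pattern, so [[ $a = $b ]] is not strictly equivalent to [ "$a" = "$b" ]. With the benchmark's values (digits and newlines only) both forms give the same result, but they diverge as soon as the right-hand side contains glob characters. A minimal sketch, using made-up values that are not part of the benchmark:

```bash
#!/bin/bash
# Hypothetical demo values, chosen so that $b contains a glob character.
a="file.txt"
b="*.txt"

# Unquoted right-hand side: $b is used as a pattern, so this matches.
if [[ $a = $b ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted right-hand side: "$b" is compared literally, so this does not match.
if [[ $a = "$b" ]]; then quoted="match"; else quoted="no match"; fi

# The portable test never does pattern matching.
if [ "$a" = "$b" ]; then portable="match"; else portable="no match"; fi

echo "unquoted [[ ]]: $unquoted"
echo "quoted [[ ]]:   $quoted"
echo "portable [ ]:   $portable"
```

Quoting the right-hand side, as in [[ $a = "$b" ]], gives the bash form the same semantics as the portable one.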

4 comments:

  1. Note that

    [[ $a = $b ]]

    is *not* the same as

    [ "$a" = "$b" ]

    You want:

    [[ $a = "$b" ]]

    Otherwise, $b is interpreted as a pattern.

    Oh, and portably, you want:

    [ x"$a" = x"$b" ]

    Otherwise, things can break horribly.

    And for that matter, there’s mksh…

    The -fixed versions have the above changes applied. These numbers are from the second consecutive run, to avoid cold vs. warm cache differences.

    $ for script in script*; do for shell in bash dash mksh mksh-static; do [[ $shell = dash ]] && [[ $script = *bash* ]] && continue; echo -n $shell $script\ ; time $shell $script; done; done 2>&1 | column -t
    bash         script1-bash.sh            0m3.03s  real  0m2.85s  user  0m0.12s  system
    mksh         script1-bash.sh            0m0.31s  real  0m0.29s  user  0m0.03s  system
    mksh-static  script1-bash.sh            0m0.35s  real  0m0.32s  user  0m0.02s  system
    bash         script1-portable.sh        0m4.40s  real  0m4.25s  user  0m0.07s  system
    dash         script1-portable.sh        0m0.15s  real  0m0.11s  user  0m0.03s  system
    mksh         script1-portable.sh        0m0.28s  real  0m0.26s  user  0m0.00s  system
    mksh-static  script1-portable.sh        0m0.29s  real  0m0.28s  user  0m0.02s  system
    bash         script2-bash-fixed.sh      0m1.04s  real  0m0.91s  user  0m0.08s  system
    mksh         script2-bash-fixed.sh      0m0.13s  real  0m0.11s  user  0m0.03s  system
    mksh-static  script2-bash-fixed.sh      0m0.15s  real  0m0.14s  user  0m0.00s  system
    bash         script2-portable-fixed.sh  0m1.91s  real  0m1.88s  user  0m0.04s  system
    dash         script2-portable-fixed.sh  0m0.11s  real  0m0.10s  user  0m0.00s  system
    mksh         script2-portable-fixed.sh  0m0.12s  real  0m0.12s  user  0m0.00s  system
    mksh-static  script2-portable-fixed.sh  0m0.15s  real  0m0.12s  user  0m0.01s  system

    So mksh isn’t far behind dash, especially in the [[ case. mksh-static will, thanks to Iustin’s perf-null, become faster on modern CPUs soonish (it’s built with -DMKSH_SMALL, which currently both skips code and trades performance for size, for example by disabling inlining; the latter can be made optional once I get perf running myself so I can measure the improvements of the various patches).

    And mksh _can_ do many things the GNU bash way.

    And even things it can’t do.

    Replies
    1. Hi Thorsten, I sort of knew somebody would comment on other shells :)

      You are right, I missed the quotes on the [[ example. FWIW, I get about the same result with the quotes added: 4.24s

      Prefixing the variables with a character had about the same results, so I preferred to leave them out to reduce the noise.

      Now, given that the scripts run faster on your test machine, you had better retry them with more iterations, say at least 50 repetitions. In my case, I considered 10 repetitions enough to demonstrate the difference.

    2. Here dash is more than twice as fast as mksh (much more so for script1); the fixed versions also make no real difference in performance. I had to change the "for i in $(seq 1 10)" from 10 to 100 to get meaningful data.

      script1-bash.sh
      bash real 0m3.783s user 0m3.632s sys 0m0.150s
      mksh real 0m1.720s user 0m1.555s sys 0m0.165s
      script1-portable.sh
      bash real 0m5.552s user 0m5.427s sys 0m0.123s
      dash real 0m0.187s user 0m0.183s sys 0m0.003s
      mksh real 0m1.494s user 0m1.373s sys 0m0.122s
      script2-bash.sh
      bash real 0m0.821s user 0m0.818s sys 0m0.004s
      mksh real 0m0.523s user 0m0.484s sys 0m0.041s
      script2-bash.sh.old
      bash real 0m0.834s user 0m0.829s sys 0m0.006s
      mksh real 0m0.523s user 0m0.485s sys 0m0.037s
      script2-portable.sh
      bash real 0m2.364s user 0m2.298s sys 0m0.066s
      dash real 0m0.184s user 0m0.180s sys 0m0.004s
      mksh real 0m0.453s user 0m0.423s sys 0m0.031s
      script2-portable.sh.old
      bash real 0m2.359s user 0m2.287s sys 0m0.072s
      dash real 0m0.184s user 0m0.180s sys 0m0.004s
      mksh real 0m0.452s user 0m0.416s sys 0m0.035s

  2. Sure, can do. But first I’ll work with nullperf. I have quite a number of other optimisations in mind (such as pre-initialising the hash tables for things like commands and builtins at compile time, although not with gperf, to keep the code size low, which is still the priority).

    And I can test on m68k, which probably qualifies as “slow enough” ☺, and on other 1990s-era actual hardware.

    I decided to use the “correct” form just for the sake of completeness and fairness of the tests. Another interesting thing would be how much time was spent in seq/jot… I guess for real tests I’ll inline their results.
