November 13, 2013

A bashism a week: heredocs

One great feature of POSIX shells is the so-called heredoc. Heredocs are even available in languages such as Perl, PHP, and Ruby.

So where is the bashism?

It's in the implementation. What odd thing do you see below?


$ strace -fqe open bash -c 'cat <<EOF
foo
EOF' 2>&1 | grep -v /lib
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/dev/tty", O_RDWR|O_NONBLOCK|O_LARGEFILE) = 3
open("/proc/meminfo", O_RDONLY|O_CLOEXEC) = 3
[pid 6696] open("/tmp/sh-thd-1384296303", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC|O_LARGEFILE, 0600) = 3
[pid 6696] open("/tmp/sh-thd-1384296303", O_RDONLY|O_LARGEFILE) = 4
[pid 6696] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
foo
--- SIGCHLD (Child exited) @ 0 (0) ---


Yes, it uses temporary files!

So do ksh, pdksh, mksh, posh and possibly other shells. Busybox's sh and dash do not use temporary files, though:


$ strace -fqe open dash -c 'cat <<EOF
foo
EOF' 2>&1 | grep -v /lib
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[pid 6767] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
foo
--- SIGCHLD (Child exited) @ 0 (0) ---


Next time you want data to never hit a hard disk, beware that heredocuments and herestrings are best avoided.
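
If all you need is to feed a short string to a command's standard input, a pipe is one way to do it without a heredoc. A minimal sketch, where the variable name secret and the use of cat as the consumer are only placeholders:

```shell
# Hypothetical data that should not end up in a temporary file.
secret='foo'

# The data travels through a pipe instead of a file created by the shell.
# printf is used rather than echo for predictable output across shells.
out=$(printf '%s\n' "$secret" | cat)
printf '%s\n' "$out"
```

Whether printf is a built-in depends on the shell; if it is not, the data shows up in the argument list of an external process.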

October 09, 2013

A bashism a week: maths

You've probably already done some basic maths in shell scripts, but do you know what else you can actually do?

Pick at least 4 operations that you can do in bashisms-free shell scripts:

$((n+1))
$((n>8))
$((n^4))
$((--n))
$((n*=5))
$((n++))
$((n==1?2:3))

The POSIX:2001 standard defines the arithmetic expansion requirements, which leads us to selecting all of the above operations except two:

$((--n))
$((n++))

"--" and "++" are not required to be implemented, and in some cases they may lead to unexpected results, such as the following:


$ bash -c 'n=1; echo $((++n))'
2
$ dash -c 'n=1; echo $((++n))'
1


Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.
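
Both operators are easy to replace with operations that the standard does require. A small sketch (the variable names are arbitrary):

```shell
n=1

# Portable replacement for $((++n)): increment, then use the new value.
n=$((n + 1))
pre=$n

# Portable replacement for $((n++)): keep the old value, then increment.
post=$n
n=$((n + 1))

echo "pre=$pre post=$post n=$n"
```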

October 08, 2013

Faster, more stable and new opportunities

A new version of the code behind http.debian.net was deployed a few days ago. It has proved to be far more stable, faster, and scalable compared to the previous mod_perl-based deployment.

There were a couple of glitches during the earlier roll-out for IPv6 users, fixed thanks to the reports by Cyril Brulebois, Michael Stapelberg and Robert Drake.

What's behind?
The redirector is now a Plack application (with no middleware) running under the Starman server with an Apache frontend. Requests are processed faster than before and the mod_perl-induced system overload is finally gone.

The redirector is now easier to test and develop. Deploying the live instance is not yet fully streamlined, but it has seen a lot of improvement. Some important changes to the way the redirector works are already on their way to see the light and I am going to be announcing them when they do. Fork the repository and hack a few changes, contributions are welcome :)

It's probably time to move it under debian.org, and finish making it the default everywhere. It's even made its way into the installation manual.

October 02, 2013

A bashism a week: dangerous exports

As a user of a shell you have most likely had the need to export a variable to another process; i.e. set/modify an environment variable.

Now, how do you stop exporting an environment variable? Can you export anything else?

The bash shell offers the -n option of the export built-in, claiming it "remove[s] the export property". However, this feature is not part of the POSIX:2001 standard and is, therefore, a bashism.

A portable way to stop exporting an environment variable is to unset it. E.g. the effect of "export MY_VAR=foo" can be reverted by calling "unset MY_VAR" - surely enough, this will also destroy the content of the variable.

An equivalent could then be:

# to stop exporting/"unexport" the MY_VAR environment variable:
my_var="$MY_VAR" ; unset MY_VAR ;
MY_VAR="$my_var" ; unset my_var ;


The above code makes a copy of the variable before destroying it, and then restores its content.

How about exporting other things? Did you know that you can export shell functions?

With the bash shell, you can export a function with the -f parameter of the export built-in. Needless to say, this is a bashism. Its secret? It's just an environment variable with the name of the function and the rest of the function definition as its value.

Try this:

$ echo="() { /bin/echo 'have some bash' ; }" bash -c 'echo "Hello World!"'
have some bash


Yes, this means that if you can control the content of an environment variable passed to bash you can probably execute whatever code you want. It comes in handy when you want to alter a script's behaviour without modifying the script itself.

Possibilities are endless thanks to bash's support for non-standard characters in function names. Functions with slashes can also be exported, for example:

/sbin/ifconfig() {
echo "some people say you should be using ip(1) instead" ;
}


Are you into bug hunting? export exec='() { echo mount this ; }'

September 25, 2013

A bashism a week: aliases

In a response to my blogpost about bashisms in function names, reader Detlef L pointed out in a comment that aliases allow non-standard characters in their names, contrary to functions. They could then be used to, for example, set an alias of the run-parts(1) command (cf. the blog post).

Aliases indeed allow characters such as commas ( , ) to be used in the alias name. However, aliases are an extension to the POSIX:2001 specification and are therefore bashisms. Moreover, the character set defined by POSIX does not include dashes.

Last but not least, aliases belong to the list of shell features that are usually "disabled" when the shell is run in non-interactive mode. I.e.


$ bash <<EOF
alias true=false;
if true; then echo alias disabled; else echo alias enabled; fi
EOF

alias disabled
$ bash -i <<EOF # force interactive mode
alias true=false;
if true; then echo alias disabled; else echo alias enabled; fi
EOF

$ alias true=false;
$ if true; then echo alias disabled; else echo alias enabled; fi
alias enabled
$ exit


To add to the fun of different behaviours, other shells always expand aliases.

If you decide to play with aliases you should note one thing: they are only enabled from the line after the definition. E.g. note the difference below

$ dash -c 'alias foo="echo foo"; foo'
dash: foo: not found
$ dash -c 'alias foo="echo foo";
foo'
foo

September 23, 2013

Worse than no support

You are in need of support and you go ask in project x's forums (mailing list, online board, etc.) and get something worse than no support:

Responses from an incompetent with plenty of time and lots of FUD to spread.

Now that we have "umarell" thanks to Enrico Zini and other Italian folks, we need a name for the above.

Apologies to some readers who might be disappointed by this post. I don't like calling people names but I've seen too many of the above in less than two hours not to be tired.

September 18, 2013

A bashism a week: testing strings

The bashism of this week deals again with tests. How do you compare two strings to tell which one goes before or after the other in the alphabet in a shell script?
(also known as lexicographic comparison)

If you are familiar with perl and the relationship between its comparison operators and that of shell scripts, you would probably write the test as follows:


$ test bar '<' foo
$ test foo '>' bar


And yes, that would work, but not with all shells. I'm afraid to tell you that the < and > comparison operators are not required by the POSIX:2001 specification and are, therefore, bashisms.

What does this have to do with Perl? Well, the shell way is the inverse of the Perl way.
What? In Perl you test the equality of two numbers with '==', in shell you use '-eq'; in perl the equality of strings is 'eq', in shell it is '='. In shell scripts you can compare two numbers with -gt (test 1 -gt 0) while in perl the equivalent comparison is with > (1 > 0).

So in this case the perl way is "bar lt foo" and based on the above the shell way should be "bar < foo". However, care must be taken when using such operators in shell scripts. Since < and > are used for redirections in shell scripts, one must quote them even when using bash. An alternative is to use the [[ special test function of bash which alters the shell syntax.

This time I didn't write a function to portably (or "somewhat portably") replace such comparisons. Feel free to share your solution in the comments, with bonus points if you come up with a solution without using external commands.
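
For what it's worth, here is one possible sketch that relies on the external expr command, so no bonus points. The function name str_lt is made up for this example. The x prefix keeps expr from treating the arguments as numbers or operators, and the result follows the collating sequence of the current locale.

```shell
# str_lt A B: succeed if string A sorts before string B.
# expr prints 1 or 0 (discarded here) and sets its exit status accordingly.
str_lt() {
    expr "x$1" '<' "x$2" >/dev/null
}

if str_lt bar foo; then
    echo "bar goes before foo"
fi
```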

September 04, 2013

A bashism a week: tilde expansion

Did you know that you can get a user's home directory with the "~username" tilde expansion? You probably do, but how about other tilde expansions?

Thanks to a few bashisms you can make your script even more difficult to read by using "~+" to get $PWD, and "~-" to get $OLDPWD.

Are you using bash and want to access the directories stack created when using the pushd and popd bashisms? The ~i (where i is an integer) expansion can give you that. It gives you forward (~+i) and backward (~-i) access to the directories stack.

But beware, if the directories stack is smaller than the number you used, there won't be any expansion.

When using tildes in shell scripts, make sure you quote to avoid unwanted expansions. Note that the posh shell in wheezy and older does support those non-POSIX expansions.

Note: in the first examples the tilde is quoted for easier reading; the expansion doesn't occur if the tilde is quoted.
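
The portable equivalents of "~+" and "~-" are the PWD and OLDPWD variables, both maintained by cd. A minimal sketch, using /tmp and / merely as example directories:

```shell
# Run in a subshell so that the caller's working directory is untouched.
dirs_seen=$(
    cd /tmp && cd / || exit 1
    # Instead of ~+ and ~-:
    printf '%s %s' "$PWD" "$OLDPWD"
)
echo "$dirs_seen"
```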

August 28, 2013

Scheduling mistake

Sometimes people make mistakes and things don't go as expected. Eventually, they find out and say oops.

The next blog post for the a bashism a week series was scheduled for publishing next Wednesday due to a mistake on my side. Sorry about that! Given that it's a bit late today, let's take a break this week.

August 21, 2013

A bashism a week: function names

The bashism of this week is easy to hit when overriding the execution of a command with a shell function. Think of the following example scenario:

Replacing the yes(1) command with a shell function:
$ exec /bin/bash
$ type -t yes
file
$ yes() { while :; do echo ${1:-y}; done; }
$ type -t yes
function

Now every time yes is called the above-defined shell function will be called instead of /usr/bin/yes.

Apply the same principle to replace the run-parts(8) command with the following overly-simplified shell function:

$ run-parts() { 
    if [ "$1" = "--test" ]; then
        shift;
        simulate=true;
    else
        simulate=false;
    fi;
    for f in "$1"/*; do
        [ -x "$f" ] && [ -f "$f" ] || continue;
        case "$(basename "$f")" in 
            *[!a-zA-Z0-9_/-]*)
                :
            ;;
            *)
                if $simulate; then
                    echo "$f";
                else
                    "$f";
                fi
            ;;
        esac;
    done
}
$ type -t run-parts
function
(note the use of negative matching)

It also works as expected. However, when running it under a shell that only supports the function names required by the POSIX:2001 specification it will fail. One such shell is dash, which aborts with a "Syntax error: Bad function name", another is posh which aborts with a "run-parts: invalid function name".

If you ever want to have function names with dashes, equal signs, commas, and other unusual characters make sure you use bash and ksh-like shells (and keep that code to yourself). Yes, you can even have batch-like silencing of stdout with
function @ { "$@" > /dev/null ; }

Update: there were missing quotation marks in the example @ function.

August 14, 2013

A bashism a week is back

After a while without posts on the a bashism a week series, it is coming back!

Next week, at the usual time and day of the week, the series of blog posts about bashisms will be back for at least one more month. Subscribe via Atom so you don't miss any post, and check all the previous posts.

The a bashism a week series covers some of the differences between bash and the behaviour of other shells, and the requirements of the POSIX standard regarding shell scripting. Or put simply: the posts are a guide to common bashisms, allowing you to identify them and avoid their use for more compatible and portable code.

Happy reading!

July 31, 2013

Ten years-old ebook reader

I own a ten years-old ebook reader.

Is it a smartphone?
It has features that make it pretty much like one, except that it can't make phone calls. So no. (However, I dare say that making phone calls is one of the least used features of smartphones nowadays. Perhaps in the future smartphones won't even be phones anymore, due to atrophy.)

What is it then?
It is a Sony CLIÉ, a PEG-SJ22 to be more precise. A PDA running Palm OS 4.1 that I'm now using as an ebook reader. To my surprise, the Plucker and iSilo readers still exist and at least the latter seems somewhat alive - there is even a version for Android.

Nowadays there are ebook readers with electronic paper displays, wifi or 3G connectivity, and other features but they all come down to the same thing: an ebook reader. Truth be told, the technology is quite old. In 2004, a year after the release of the PEG-SJ22, Sony also released the LIBRIé EBR-1000EP in Japan, an ebook reader with a 6" electronic paper display. A few years later it was released in the US as the Sony Reader.

Certainly, there have been advances since those devices were first released, but I have yet to see something that is really innovative. This year we are back to wrist watches, which were first released more than ten years ago - and one of them even ran Linux.

Netbooks also had devices like the PEG-UX50 as their ancestor.

And if you were thinking about shoes, you've arrived late: the Puma RS already had some chips in them, and the Adidas 1 were also sported a few years ago. World, it's time to innovate.

July 09, 2013

Explaining segmentation fault errors

Want to fix that segfault you keep hitting or that was reported to you? The first step is to understand the error message you get.

So you have a message like the following:
segfault at bfea3fec ip 080ee07e sp bfea3fa0 error 6

You might already know that ip means instruction pointer and sp means stack pointer and as such the addresses that follow them are the values in those registers. But what does the error number mean?

The error number, or code, actually gives you a better explanation of what the cause of the segfault is. The number's bits are flags describing the error and are architecture-dependent. For x86/x86_64 I just wrote an online converter/decoder that you can use to explain the segfault error code.

As an example, the above error code is explained as:
The cause was a user-mode write resulting in no page being found.

And the common error 4:
The cause was a user-mode read resulting in no page being found.
(also known as a null pointer dereference).
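
For the curious, the low bits of the x86 error code can be decoded with a few lines of shell. This is only a sketch, not the code behind the online decoder. The function name is made up, and only bits 0 (page present), 1 (write), 2 (user mode) and 4 (instruction fetch) are looked at.

```shell
# decode_segfault_error CODE: explain an x86/x86_64 page fault error code.
decode_segfault_error() {
    code=$1

    # Bit 2: the access came from user mode (set) or kernel mode (clear).
    if [ $((code & 4)) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi

    # Bit 4: instruction fetch. Bit 1: write (set) or read (clear).
    if [ $((code & 16)) -ne 0 ]; then
        access="instruction fetch"
    elif [ $((code & 2)) -ne 0 ]; then
        access="write"
    else
        access="read"
    fi

    # Bit 0: protection fault (set) or no page found (clear).
    if [ $((code & 1)) -ne 0 ]; then
        reason="a protection fault"
    else
        reason="no page being found"
    fi

    printf 'The cause was a %s %s resulting in %s.\n' "$mode" "$access" "$reason"
}

decode_segfault_error 6
decode_segfault_error 4
```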

Enjoy.

June 12, 2013

Service update: 5 million a day

After the release of Debian wheezy traffic jumped to about 1 million requests per day, but as the weeks have passed by traffic has continued to increase to 5 million requests every day.

Even though it is a new record for the redirector it can not yet be compared to openSUSE.org's 20-40 million on their mirrorbrain instance. Let's see how long it takes to get there.

User adoption has increased but it has yet to become the default mirror in several places.

June 05, 2013

The "let the tool do the work" update

Over the last few weeks I've been making several changes to http.debian.net to detect mirrors that don't follow Debian's mirroring guidelines and end up causing problems for end users. The changes will mean fewer hash mismatches and similar errors.

As I wrote back in December, the redirector is becoming nicer but also stricter. Some of the changes I recently made caused over 30 mirrors to be completely disabled from the redirector. This is not ideal and I don't like having to disable mirrors. They are contributions after all.

The only thing I can do, and can't stress enough, is encourage people to use an up to date ftpsync script (available at project/ftpsync/ on every mirror) to mirror Debian.
It takes care of all the little but important things for you. Really. A mirror that uses ftpsync is easier for the administrator to properly configure, and provides a consistent mirror for the benefit of the users.

Speaking of ftpsync, there is a new version! If you use ftpsync please upgrade it as soon as possible.

Other improvements are on their way. Contributions are welcome (if you like refactoring, there's quite a bit of explicitly-redundant code in check.pl that should now give a better idea of the way it needs to be refactored.)

May 22, 2013

Dealing with bashisms in proprietary software

Sometimes it happens that for one reason or another there's a need to use a proprietary application (read: can not be modified due to its licence) that contains bashisms. Since the application can not be modified and it might not be desirable to change the default /bin/sh, dealing with such applications can be a pain. Or not.

The switchsh program (available in Debian) by Marco d'Itri can be used to execute said application under a namespace where bash is bind-mounted on /bin/sh. The result:

$ sh --help
sh: Illegal option --
$ switchsh sh --help | head -n1
GNU bash, version 4.1.5(1)-release-(i486-pc-linux-gnu)

Simple, yet handy.

May 08, 2013

Almost one million requests per day

In the first 48 hours after its log files were rotated last Sunday, http.debian.net handled almost 2 million requests, for an average of 11 requests per second.

In the last weeks before the release of Debian wheezy the number of requests had dropped slightly below 2 million per week.

Debian is alive.

May 04, 2013

A single address to get Debian Wheezy while it's hot

Already preparing to install or to upgrade to Debian Wheezy?

You can use the http.debian.net redirector to install Debian Wheezy or upgrade to it from Squeeze and make use of Debian's ever-growing 370-large mirrors network to get it.

APT one liner (to be used in your /etc/apt/sources.list file):
deb http://http.debian.net/debian wheezy main

During the installation process you can also choose to use it by manually entering http.debian.net as an HTTP mirror and /debian/ as the path.

Get it while it's hot!

May 02, 2013

An ever-growing mirrors network, a year later

A year ago I wrote about Debian's ever-growing mirrors network, so it is time to review the numbers.

Compared to the numbers from last year, today Debian is being served via http by about 370 mirrors world-wide, and is also available via ftp from 330 mirrors. So that's an increase of 40 mirrors in one year!

The number of countries with Debian mirrors also increased to 76, 3 more since last year.

This has only been possible thanks to the sponsors hosting the mirrors.
During this year some sponsors have had to retire their mirrors, sometimes ceasing years of contributions to the project and its community.

A big thanks is deserved to past and current sponsors.

April 08, 2013

How the world ended up in Costa Rica

Even though I haven't had much time to dedicate to http.debian.net lately, it has been up and running, or should I say serving?

Part of its job is to detect mirrors that have temporary issues or are entirely gone, down, unavailable. It does so, and many other things, by monitoring the so-called "trace files". A very important one being the "master" (or "origin") trace file.

With the recent integration of backports into the main archive, the master trace file of the backports mirrors also changed. Long story short, this change caused backports mirrors to no longer be considered by the mirror redirector as candidates as soon as they were up to date.

After the usual mirror synchronisation delay, more and more mirrors were disabled and subsets of "up to date" candidates re-calculated. This reached a critical point when only one mirror was left in the database. The mirror had not been synchronised for a couple of weeks.

This mirror is located in Costa Rica, and as the only candidate left in the database it was the only one used to serve requests for the backports archive. No matter where the client was located in the world.

The issue was later noticed and the necessary updates to the mirrors master list made. Mirrors started to be re-considered as they were re-checked (with some delay due to the rate limiter) and the subsets re-calculated. In a few hours everything was back to normality.

Correctness and fault-tolerance don't always get together very well...

March 29, 2013

Chocolate quote

Kinda appropriate for some recent events:
Il y a autant de générosité à recevoir qu'à donner
- Julien Green

Roughly translated to
There is only as much generosity to receive as there is to give.

March 27, 2013

A bashism a week: substrings (dynamic offset and/or length)

Last week I talked about the substring expansion bashism and left writing a portable replacement of dynamic offset and/or length substring expansion as an exercise for the readers.

The following was part of the original blog post, but it was too long to have everything in one blog post. So here is one way to portably replace said code.

Let's consider that you have the file name foo_1.23-1.dsc of a given Debian source package; you could easily find its location under the pool/ directory with the following non-portable code:
file=foo_1.23-1.dsc
echo ${file:0:1}/${file%%_*}/$file

Which can be re-written with the following, portable, code:
file=foo_1.23-1.dsc
echo ${file%${file#?}}/${file%%_*}/$file

Now, in the Debian archive source packages with names with the lib prefix are further split, so the code would need to take that into consideration if file is libbar_3.2-1.dsc.

Here's a non-portable way to do it:
file=libbar_3.2-1.dsc
if [ lib = "${file:0:3}" ]; then
    length=4
else
    length=1
fi

# Note the use of a dynamic length:
echo ${file:0:$length}/${file%%_*}/$file

While here's one portable way to do it:
file=libbar_3.2-1.dsc
case "$file" in
    lib*)
        length=4
    ;;
    *)
        length=1
    ;;
esac

length_pattern=
while [ 0 -lt $length ]; do
    length_pattern="${length_pattern}?"
    length=$(($length-1))
done

echo ${file%${file#$length_pattern}}/${file%%_*}/$file

The idea is to compute the number of question marks needed and use them where needed. Here are two functions that can replace substring expansion as long as the values are not negative (negative values are also supported by bash).

genpattern() {
    local pat=
    local i="${1:-0}"

    while [ 0 -lt $i ]; do
        pat="${pat}?"
        i=$(($i-1))
    done
    printf %s "$pat"
}

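substr() {
    local str="${1:-}"
    local offset="${2:-0}"
    local length="${3:-}"
    local tail

    # Skip the first $offset characters
    if [ 0 -lt "$offset" ]; then
        str="${str#$(genpattern "$offset")}"
    fi

    # Without a length, take everything that is left
    if [ -z "$length" ]; then
        length=${#str}
    fi

    # Whatever follows the first $length characters is removed
    tail="${str#$(genpattern "$length")}"
    printf %s "${str%"$tail"}"
}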

Note that it uses local variables to avoid polluting global variables. Local variables are not required by POSIX:2001.

Enough about substrings!

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.

March 20, 2013

A bashism a week: substrings

Sometimes obtaining a substring in a shell script is needed. The bashism of this week comes in handy as it allows one to obtain a substring by indicating the offset and even the length of the substring. This is the ${varname:offset:length} bashism, also known as substring expansion.

The portable "replacements" are simple if the offset (and the length) are static. For example, the following code would print the substring of "foo" consisting of only the last two characters:
var=foo
# Replace the bashism ${var:1} with:
echo ${var#?}

The length can then be limited with additional pattern-matching removal expansions:
var="portable code"
# Replace the bashism ${var:3:5} with the following code

# Offset is 3, so we use three ? (interrogation) characters:
part=${var#???}

# Length is 5, so we use five ? characters:
echo ${part%${part#?????}}

As can be seen, it is not impossible to replace a substring expansion.

The portable code becomes slightly more complex if the offset and/or the length are dynamic. I leave that as an exercise for the readers.

Feel free to post your code as a comment (use the <pre> tags, please) or in another public way. My own response is already scheduled to be published next week at the same time as usual.

Note: substring expansions can also be replaced with a wide variety of external commands. This is a pure-POSIX shell scripting example.
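
As an illustration of the external command route, here is a sketch using cut. It assumes that the value holds a single line of text; the variable names are arbitrary.

```shell
var="portable code"
offset=3
length=5

# cut counts characters from 1, so the range is offset+1 .. offset+length.
part=$(printf '%s\n' "$var" | cut -c "$((offset + 1))-$((offset + length))")
echo "$part"
```

Unlike the pure-shell version, this costs a pipe and an extra process per substring.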

March 13, 2013

A bashism a week: assigning to variables and special built-ins

Assigning a value to a variable when executing a command is a way to populate the command's environment, without the variable assignment persisting after the command completes. This is not true, however, when a special built-in is the command being executed.

POSIX:2001 states that "Variable assignments specified with special built-in utilities remain in effect after the built-in completes".

Not only is this tricky because it depends on whether a utility is a special built-in or not, but the bash interpreter does not respect that behaviour of the POSIX standard. That is, special built-ins are not so "special" to the bash interpreter.

This leaves two things to take into account when assigning to a variable when executing a command: whether the command is a special built-in, and whether bash is interpreting the script.

Now, the list of special built-ins is rather short and it would be a bit unusual to perform variable assignments when calling them, except for some cases: "exec", "eval", "." (dot), and ":" (colon).

It is important to note that ":" and "true" differ in this regard; the former is a special built-in, the latter is just a utility. Watch out for these kinds of differences when using ":" or "true" to nullify a command. E.g.

Compare
$ dash -c '
method=sed
# some condition or user setting ends up making:
method=true
# later:
foo=bar $method
echo foo: $foo'
foo: 
To (redacted for brevity):
$ dash -c '
method=:
foo=bar $method
echo foo: $foo'
foo: bar
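
One way to get the same result whatever the command and the shell turn out to be is to confine the assignment to a subshell. A minimal sketch, reusing the method and foo names from the example:

```shell
method=:
unset foo

# The assignment happens inside a subshell, so it cannot leak out,
# whether $method is a special built-in or not.
( foo=bar; export foo; $method )

echo "foo: ${foo-}"
```

The price is an extra subshell, and anything else the command changes in the shell's state is lost as well.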

March 06, 2013

A bashism a week: returning

Inspired by Thorsten Glaser's comment about where you can break from, this "bashism a week" is about a behaviour not implemented by bash.

return is a special built-in utility, and it should only be used in functions and in scripts executed by the dot utility. That's what the POSIX:2001 specification requires.

If you return from any other scope, for example by accidentally calling it from a script that was not sourced but executed directly, the bash shell won't forgive you: it does not abort the execution of commands. This can lead to undesired behaviour.

A wide variety of shell interpreters silently handle such calls to return as if exit had been called.

An easy way to avoid such undesired behaviours is to follow the best practice of setting the e option, i.e. set -e. With that option set at the moment of calling return outside of the allowed scopes, bash will abort the execution, as desired.

However, the POSIX specification does not guarantee the above behaviour either, as the result in such cases is "unspecified".
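
For a script that may be either sourced or executed directly, a commonly seen idiom is to fall back to exit when return fails. The sketch below runs a small script body through sh -c only to demonstrate it; in a real script just the "return ... || exit ..." line would be used.

```shell
# The body between the quotes plays the role of a directly executed script.
out=$(sh -c '
echo before
# Leave the script whether it was sourced or executed directly:
return 0 2>/dev/null || exit 0
echo after
')
echo "$out"
```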

February 27, 2013

A bashism a week: appending

The very well known appending operator += is a bashism commonly found in the wild. Even though it can be used for things such as adding to integers (when the variable is declared as such) or appending to arrays, it is usually used for appending to a string variable.

As I previously blogged about it, the appending operator bashism is only useful when programming for the bash shell.

Whenever you want to append a string to a variable, repeating the name of the variable is the portable way. I.e.
foo=foo
foo="${foo} bar"
# Instead of foo+=" bar", which is a bashism

See? Replacing the += operator is not rocket science.

Note: One should be aware that makefiles do have a += operator which is safe to use when appending to a make variable. But don't let this "exception" fool you: code in configure.ac and similar files is executed by the shell interpreter. So don't use the appending operator there.

February 25, 2013

A tale of a bug report

Part 1:
A bug report is filed.
Part 2:
A patch is later provided by the submitter.
Part 3:
The patch is added to the package, the bug gets fixed.

[some time later]

Part 4:
A new upstream version is released, the patch is dropped.
Part 5:
The bug report is filed, again.

February 20, 2013

A bashism a week: pushing and pop'ing directories

Want to switch back-and-forth between directories in your shell script?
The bashism of this week can be of some help, but for most needs, the cd utility is more than enough.

pushd, popd, and the extra built-in dirs are bashisms that allow one to create and manipulate a stack of directory entries. For a simple, temporary, switch of directories the following code is portable as far as POSIX:2001 is concerned:

cd /some/directory
  touch some files
  unlink others
  # etc
cd - >/dev/null
# We are now back at where we were before the first 'cd'

Which is equivalent to the following, also portable, code:

cd /some/directory
  touch some files
  unlink others
  # etc
cd "$OLDPWD"
# We are now back at where we were before the first 'cd'

Multiple switches can also be implemented portably without storing the name of the directories in variables at the expense of using subshells (and their side-effects).
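
A sketch of the subshell approach, with /tmp and / as example directories and arbitrary variable names:

```shell
start_dir=$PWD

# Each subshell has its own working directory; leaving it "pops" back.
trail=$(
    cd /tmp || exit 1
    printf '%s ' "$PWD"
    (
        cd / || exit 1
        printf '%s ' "$PWD"
    )
    # Back in /tmp without any explicit cd.
    printf '%s' "$PWD"
)

echo "$trail"
```

The side-effect to keep in mind is that variables assigned inside a subshell are gone once it ends.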

However, if you think you can solve your problem more conveniently by using "pushd" and "popd" don't forget to document the need of those built-ins and to adjust the shebang of your script to that of a shell that implements them, such as bash.

February 13, 2013

A bashism a week: negative matches

Probably due to the popular way of expressing the negation of a character class in regular expressions, it is common to see negative patterns such as [^b] in shell scripts.

However, using an expression such as [^b] where the shell is the one processing the pattern will cause trouble with shells that don't support that extension. The right way to express the negation is using an exclamation mark, as in: [!b]

Big fat note: this only applies to patterns that the shell is responsible for processing. Some of such cases are:

case foo in
    [!m]oo)
        echo bar
    ;;
esac
and
# everything but backups:
for file in documents/*[!~]; do
    echo doing something with "$file" ...
done

If the pattern is processed by another program, beware that most won't interpret the exclamation the way the shell does. E.g.

$ printf "foo\nbar\nbaz\n" | grep '^[^b]'
foo
$ printf "foo\nbar\nbaz\n" | grep '^[!b]'
bar
baz

February 06, 2013

A bashism a week: short-circuiting tests

The test/[ command is home to several bashisms, but as I believe I have demonstrated: incompatible behaviour is to be expected.

The "-a" and "-o" binary logical operators are no exception, even if documented by the Debian Policy Manual.

One feature of writing something like the following code, is that upon success of the first command, the second won't be executed: it will be short-circuited.
[ -e /dev/urandom ] || [ -e /dev/random ]

Now, using the "-a" or "-o" bashisms even in shell interpreters that support them can result in unexpected behaviour: some interpreters will short-circuit the second test, others won't.

For example, bash doesn't short-circuit:
$ strace bash -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1|grep /dev
stat64("/dev/urandom", ...) = 0
stat64("/dev/random", ...) = 0
Neither does dash:
$ strace dash -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1|grep /dev
stat64("/dev/urandom", ...) = 0
stat64("/dev/random", ...) = 0
But posh does:
$ strace posh -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1|grep /dev
stat64("/dev/urandom", ...) = 0
And so does pdksh:
$ strace pdksh -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1|grep /dev
stat64("/dev/urandom", ...) = 0

(Output of strace abridged for brevity.)

So even in Debian, where the feature can be expected to be implemented, its semantics are not very well defined. So much for using this bashism... better avoid it.
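
As a rough sketch of the alternative, both operators can be replaced by separate test commands joined with the shell's own || and &&, which do short-circuit. The variable names and the directory used below are assumptions chosen only for illustration.

```shell
# Instead of: [ -e /dev/urandom -o -e /dev/random ]
if [ -e /dev/urandom ] || [ -e /dev/random ]; then
    have_random=yes
else
    have_random=no
fi

# Instead of: [ -n "$dir" -a -d "$dir" ]
dir=/    # hypothetical value, picked because / always exists
if [ -n "$dir" ] && [ -d "$dir" ]; then
    dir_ok=yes
else
    dir_ok=no
fi

echo "random device: $have_random, directory usable: $dir_ok"
```

When both kinds of operators need to be mixed, grouping with braces, as in "{ a || b; } && c", makes the intended precedence explicit.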

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.

January 30, 2013

A bashism a week: sleep

To delay execution of some commands in a shell script, the sleep command comes in handy.
Even though many shells do not provide it as a built-in, so the GNU sleep command ends up being used, there are a couple of things to note:

  • Suffixes may not be supported. E.g. 1d (1 day), 2m (2 minutes), 3s (3 seconds), 4h (4 hours).
  • Fractions of units (seconds, by default) may not be supported. E.g. sleeping for 1.5 seconds may not work under all implementations.

This, of course, refers to what POSIX:2001 requires: the sleep command only has to accept an unsigned integer. FreeBSD's sleep command does accept fractions of seconds, for example.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.

In this case, since the sleep command is not required to be a built-in, it does not matter what shell you specify in your script's shebang. Moreover, calling /bin/sleep doesn't guarantee you anything. The exception is a shell that has its own sleep built-in: if you specify it in the shebang, you could probably rely on that built-in.

The easiest replacement for suffixes is calculating the desired amount of time in seconds. As for the second case, you may want to reconsider your use of a shell script.
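
Calculating the seconds can be left to the shell's arithmetic expansion. A minimal sketch, where the variable names are made up for the example and the actual sleep call is commented out so that nothing blocks:

```shell
# Suffixes spelled out as plain seconds
two_minutes=$((2 * 60))
four_hours=$((4 * 60 * 60))
one_day=$((24 * 60 * 60))

echo "$two_minutes $four_hours $one_day"

# Portable equivalent of the non-standard "sleep 2m":
# sleep "$two_minutes"
```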

January 23, 2013

A bashism a week: output redirection

Redirecting stdout and stderr to the same file or file descriptor with &> is common and nice, except that it is not required to be supported by POSIX:2001. Moreover, trying to use it with shells not supporting it will do exactly the opposite:

  1. The command's output (to stdout and stderr) won't be redirected anywhere.
  2. The command will be executed in the background.
  3. The file will be truncated, if redirecting to a file and not using >>.

Are the characters saved worth those effects? I don't think so. Just use this instead: "> file 2>&1". Make sure you get the first redirection right: "&> file 2>&1" isn't going to do the trick.
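
A short sketch of the portable form, including the appending variant. The file name is a hypothetical one built from TMPDIR and the shell's PID, chosen only so the example cleans up after itself.

```shell
# Hypothetical scratch file for the demonstration
out="${TMPDIR:-/tmp}/redirect-demo.$$"

# stdout is sent to the file first, then stderr is pointed at the same place.
{ echo "to stdout"; echo "to stderr" >&2; } > "$out" 2>&1

# Appending works the same way, with >> as the first redirection.
{ echo "more stdout"; echo "more stderr" >&2; } >> "$out" 2>&1

captured=$(cat "$out")
rm -f "$out"
printf '%s\n' "$captured"
```

The order matters: writing "2>&1 > file" instead would duplicate stderr onto the original stdout before stdout is redirected, so the errors would not end up in the file.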

January 22, 2013

The death of the netbooks?

It's been over four years since I bought my ASUS Eee PC 1000h. I have used it almost daily ever since. Back when I bought it, new models from different brands were being released every few months due to the netbook hype.

In spite of being resource-limited due to its 1.60 GHz Atom CPU and only 1 GB of RAM, I've managed to do pretty much everything with it. Building software is slow and watching HD videos is nearly impossible, even more so when streamed from the internet and played with flash. Its limited memory capacity makes the kernel swap tens of megabytes before the KDE4 desktop is fully loaded. After launching some day-to-day applications there are usually hundreds of MBs in swap.

In spite of all this, I run the KDE4 desktop and have been able to do things such as running up to two Debian virtual machines with several services (apache httpd, mysql server, openldapd, squid, etc.) and a Windows XP one all at the same time, under virtualbox. I could have probably booted another Debian virtual machine but that would have most likely rendered the DE unusable. Oh, and did I say that this is under the "VT-x"-less N270 CPU?

This so-called netbook has proved to be rock-solid. Every component is still fully functional except for its 7-hour battery, which didn't survive a full year of day-to-day use. The keyboard is still intact and so is everything else.

Last year I thought I was going to have to seriously consider buying a replacement after seeing what I think are some signs of the end of its life. After a routine deep cleanup the keyboard stopped working properly, to the point that I couldn't even log in because half the keyboard would send the signal of a totally unrelated key. I bought an external, but still small, USB keyboard, which I used until the next deep cleanup somehow made the built-in keyboard work again.

The second sign came soon after the keyboard issue. The AC adapter was, well, no longer supplying power to the machine. Trying to buy one online proved to be futile. Replacement supplies for ASUS equipment are hard to find here in Mexico and importing them from the US results in the item being twice (or more) as expensive due to import taxes. They are even more expensive when one finally adds up the cost of shipping.

Fortunately, after spending some hours hunting down the failure in the adapter, it turned out to be a problem with the wires. With the cut wires repaired, the adapter was working again. The unit itself wasn't at fault.

Back to 2013, this netbook is ageing and every time I've looked at potential replacements I've found none that I like. I look for another netbook/ultrabook/laptop/whatever that is rock-solid, with a 10.1" or 11" display, and has a similarly compact but not oh-so-small-that-I-can't-even-type-by-only-using-my-fingertips keyboard.

The only devices that have caught my eye are the ASUS transformers (with the dock). I'm not interested in a device that only has 1 GB of memory and between 32 and 64 GB of storage, however. I'm limited enough with my Eee's 160 GB HDD.

For my needs, the pre-installed Android would have to go away and I guess it would be fun to get a transformer to run under a standard Debian Linux kernel. Since I'm not interested in doing that kind of kernel work, the transformers are out of the question.

Based on this I think I can only partially agree with Russell Coker when he states that
If tablet computers with hardware keyboards replace traditional Netbooks that's not really killing Netbooks but introducing a new version of the same thing.
Tablets with hardware keyboards may, perhaps, be the next generation of the less than 10" netbooks, but to date I've yet to see something with a display smaller than 13" that is an upgrade over the 1000h Eee I own.

January 21, 2013

January's Debian mirrors update

It's been slightly over a month since December's update to http.debian.net. Since then, Debian's mirrors network has grown by 6 more archive mirrors. Many thanks to the Debian sponsors running them!

There are now about 370 archive mirrors serving the archive over HTTP, an increase of 40 (12%) since April last year. The number of backports mirrors is now at 82, and 25 for archive.debian.org.

On the http.debian.net front there haven't been many changes since last month. Some major changes are in the works, but they didn't make it into January's code update. There were, however, a few issues with one of the hosts during the first couple of days of January. Apologies for any inconvenience this may have caused.

A new version of ftpsync addressing some issues should hopefully be released some time next month. Stay tuned to the debian-mirrors mailing list for a call for testers and probably a survey for mirror administrators.

January 16, 2013

A bashism a week: ulimit

Setting resource limits from a shell script is commonly done with the ulimit command.
Shells provide it as a built-in, if they provide it at all. As far as I know, there is no non-built-in ulimit command. One could be implemented with the Linux-specific prlimit system call, but even that requires a fairly "recent" kernel version (circa 2010).

Depending on the kind of resource you want to limit, you may get away with what some shells such as dash provide: CPU, FSIZE, DATA, STACK, CORE, RSS, MEMLOCK, NPROC, NOFILE, AS, LOCKS. I.e. options tfdscmlpnvw, plus H for the hard limit, S for the soft limit, and a for all. Bash allows other resources to be limited.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time. ulimit is not required by POSIX:2001 to be implemented for the shell.
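
One possible way to check for it at run-time is to try the built-in in a subshell, so that a shell lacking ulimit only fails there. This is a sketch under that assumption; the variable names are made up for the example, and the core file size limit is used because lowering it to 0 is harmless.

```shell
# Probe in a subshell: a missing or failing ulimit does not affect the script.
if (ulimit -c 0) >/dev/null 2>&1; then
    have_ulimit=yes
else
    have_ulimit=no
fi

if [ "$have_ulimit" = yes ]; then
    # The limit set here only applies inside the command substitution's subshell.
    core_limit=$(ulimit -c 0; ulimit -c)
    echo "core file size limit inside the subshell: $core_limit"
else
    echo "this shell has no usable ulimit; continuing without limits" >&2
fi
```

Setting the limit inside a subshell, as in "(ulimit -c 0; some_command)", keeps the rest of the script unaffected.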

January 09, 2013

A bashism a week: brace expansion

Brace expansion is well known and handy, but sadly it is not required by POSIX:2001. Shells that don't support it will simply and silently leave it as is.

If you use it to shorten commands, as in "echo Debian GNU/{Linux,kFreeBSD}", you have to spell it out or use some sort of loop.

When using brace expansion for sequences you will usually have to fall back to using the seq command or using loops. "{1..9}" can be replaced with "seq -s ' ' 1 9", "{1..9..2}" with "seq -s ' ' 1 2 9", and so on.
If you use brace expansion for sequences of characters then seq won't be of much help.

I must note that the seq command is not required by POSIX:2001, however.
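
Since neither brace expansion nor seq can be relied upon, plain loops are the safest fallback. The following is a minimal sketch of the three cases discussed above; the variable names are arbitrary.

```shell
# "echo Debian GNU/{Linux,kFreeBSD}" spelled out with a for loop
words=
for kernel in Linux kFreeBSD; do
    words="$words GNU/$kernel"
done
echo "Debian$words"

# "{1..9..2}" replaced by a while loop and arithmetic expansion
i=1
sequence=
while [ "$i" -le 9 ]; do
    # ${sequence:+ } adds the separating space only after the first item
    sequence="$sequence${sequence:+ }$i"
    i=$((i + 2))
done
echo "$sequence"

# "{a..e}": sequences of characters have to be listed explicitly
for letter in a b c d e; do
    printf '%s\n' "$letter"
done
```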

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.

January 02, 2013

A bashism a week: read

Whether for interacting with the caller, for reading the output of some command, or a file descriptor in general, the read shell command can be found in many scripts.

Unless you stick to the POSIX:2001-required "read variable_name", possibly with the -r option, you should expect problems.

  • You must always pass the name of a variable, even if you are going to discard its content.
  • Prompts, timeouts, changing the input delimiter, and basically any feature other than just reading a line are not required to be supported.

dash, for instance, supports prompts but nothing else.
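
A minimal sketch of sticking to the portable form follows. The ask function is a hypothetical stand-in for a read prompt, and the input is fed through heredocs only so that the example runs without a terminal.

```shell
# Read lines one at a time; -r keeps backslashes as they are.
count=0
while IFS= read -r line; do
    count=$((count + 1))
    printf 'line %d: %s\n' "$count" "$line"
done <<'EOF'
first line
a back\slash survives thanks to -r
EOF

# ask is a hypothetical helper replacing the non-standard prompt option:
# print the prompt with printf, then read into a named variable.
ask() {
    printf '%s' "$1" >&2
    IFS= read -r answer
}

ask 'Distribution: ' <<'EOF'
Debian
EOF
echo "answer: $answer"
```

Note that a variable name is given to read every time, even where the content would simply be thrown away.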

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.