Archive for the '/bin/sh' Category

Steaming!

2015-01-20

Valve’s steam appears to be a package manager for installing Valve software (games). Part of steam on Linux is a shell script: steam.sh.

It turns out that if you try to uninstall steam, or something like that, and you’re not careful… then this innocent 600 line shell script can kind of accidentally DELETE ALL YOUR USER FILES. Ha ha.

Much hilarity in the github issue.

At core the proximate issue is executing this command:

	rm -rf "$STEAMROOT/"*

The problem is that, perhaps in mysterious circumstances, STEAMROOT can be set to the empty string. Which means the command rm -fr "/"* gets executed. Which removes all the files that you have access to on the system (it might take its time doing this).

I’m working off this version of steam.sh.

First off, it’s 600 lines long. That, right there, should set the alarm bells ringing. No shell script should be that long. It’s just not a suitable language for writing more than a few lines in.

set -u, whilst a good idea in a lot of scripts, would not have helped here. As it happens, STEAMROOT is set, but set to the empty string.

"${STEAMROOT:?}", as suggested by one of the commentor’s in github, would have helped. The script will exit if STEAMROOT is unset or set to the empty string.

Immediately before this code there is a comment saying “Scary!”. So that’s another thing. If one programmer thinks the code is scary, then we should probably review the code. And make it less scary. Clearly adding an explicit check that STEAMROOT is set would have helped make it less scary.

It would also be a good idea to add a -- argument to rm to signify the end of the options. Otherwise if STEAMROOT starts with a «-» then it will trigger rm into thinking that it is an option instead of the directory to delete. So we should write:

    rm -fr -- "${STEAMROOT:?}"/*

STEAMROOT is assigned near the beginning of the file:

STEAMROOT="$(cd "${0%/*}" && echo $PWD)"

It is often problematic to use command substitution in an assignment. The problem is that the command inside the round brackets, cd "${0%/*}" && echo $PWD in this case, could fail. The shell script still carries on and assigns the stdout of the command to the variable. And if the command failed and produced no output then STEAMROOT will become the empty string.
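
You can see the failure mode interactively (a contrived sketch; /nonexistent stands in for whatever goes wrong inside the substitution, and the error text varies by shell):

    $ STEAMROOT="$(cd /nonexistent && echo $PWD)"
    sh: cd: /nonexistent: No such file or directory
    $ echo "[$STEAMROOT]"
    []

The cd fails, the substitution produces no output, and the assignment quietly succeeds, leaving STEAMROOT empty.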

Here would be a good place to explicitly check that STEAMROOT is not an empty string. : "${STEAMROOT:?}" will do, but if [ -z "$STEAMROOT" ] ; then exit 99; fi is more explicit.

set -e would have helped. If a command substitution is assigned to a variable and the command fails (exit code != 0) then the assignment statement fails and that will trigger set -e into exiting the script. It’s not ideal error checking, but it is better than nothing.
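
A minimal sketch of what set -e buys you here (the exit code of the failed cd becomes the exit code of the assignment):

    set -e
    STEAMROOT="$(cd /nonexistent && echo $PWD)"
    # never reached: the assignment fails and set -e exits the script
    rm -rf "$STEAMROOT/"*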

The code, as described by the comment above it, is trying to find out the location of the script. This is often problematic. There’s no portable way to find out. But as long as you’re in bash, and the script explicitly is a bash script and uses various bashisms, why not just use the relatively straightforward DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd ) as recommended by this Stack Overflow answer. No need to pretend that $0 is set to anything useful. (all of the above still applies though)

The steam.sh script is a bit enigmatic. Bits of it are written by someone who clearly knows shell scripting. The ${0%/*} thing to strip off the last component of a path is not common knowledge. But why not use dirname, as the code later on in the script does? It correctly uses the portable equality operator, a single «=», in code like if [ "$STEAMEXE" = "steamcmd" ], but later on uses the bashisms «==» and «[[». It clearly knows about the $( ... ) notation for command substitution, but then uses the legacy (and yucky) backquote syntax elsewhere. It carefully avoids dirname (which is in the POSIX standard, and therefore very likely to be installed on any given Unix system), but then uses curl without checking (and curl isn’t installed on Ubuntu by default).

In summary: too long; attempting to locate directory containing script is problematic; doesn’t do enough checking (in particular, set -e).

shell booleans are commands!

2014-10-03

How should we represent boolean flags in shell? A common approach, possibly inspired by C, is to set the variable to either 0 or 1.

Then you see code like this:


if [ $debug = 1 ]; then
...

or this example from zgrep:


if test $have_pat -eq 0; then
...

There is nothing special about 0 and 1; they are just two strings representing “the flag is set” and “the flag is unset”.

Testing strings is surprisingly awkward in shell. In Python you can go if debug: .... It would be nice if we could do something similar in shell:


if $debug ; then
...
fi

Well we can. In a shell if statement, if thing, the thing is just a command. If we arrange that debug is either true or false, then if $debug will run either the command true or the command false.


debug=true # sets flag
debug=false # unsets flag

I wish I could remember who I learnt this trick off because I think it’s super cool, and not enough shell programmers know about it. true and false are pretty much self explanatory as boolean values, and no extra code is needed because they already exist as shell commands.

You can also use this with &&:


$debug && stuff

Sometimes shell scripts have a convention where a variable is either unset (to mean false) or set to anything (to mean true). You can convert from this convention to the true/false convention with 2 lines of code:


# if foo is set to anything, set it to "true"
foo=${foo+true}
# if foo is the empty string, set it to "false"
foo=${foo:-false}
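
Putting it all together, a small sketch (the -d option and the messages are invented for illustration):

#!/bin/sh
debug=false
case $1 in
(-d) debug=true ;;
esac
$debug && echo "debug mode is on" >&2
if $debug ; then
    echo "...and this branch runs too" >&2
fi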

bash functions: it is mad!

2014-10-02

bash can export functions to the environment. A consequence of this is that bash can import functions from the environment. This leaves us #shellshocked. #shellshock aside, why would anyone want to do this? As I Jackson says: “it is mad”.

Exporting bash functions allows a bash script, or me typing away at a terminal in bash, to affect the behaviour of basically any other bash program I run. Or a bash program that is run by any other program.

For example, let’s say I write a program in C that prints out the help for the zgrep program:

#include <stdlib.h>

int main(void)
{
    return system("zgrep --help");
}

This is obviously just a toy example, but it’s not unusual for Unix programs to call other programs to do something. Here it is in action:


drj$ ./zgrephelp
Usage: /bin/zgrep [OPTION]... [-e] PATTERN [FILE]...
Look for instances of PATTERN in the input FILEs, using their
uncompressed contents if they are compressed.

OPTIONs are the same as for 'grep'.

Report bugs to <bug-gzip@gnu.org>.

Now, let’s say I define a function called test in my interactive bash session:


test () { bob ; }

This is unwise (test is the name of a well known Unix utility), but so far only harmful to myself. If I try and use test in my interactive session, things go a bit weird:


drj$ test -e /etc/passwd
The program 'bob' is currently not installed. You can install it by typing:
sudo apt-get install python-sponge

but at least I can use bash in other processes and it works fine:


drj$ bash -c 'test -e /etc/passwd' ; echo $?
0

What happens if I export the function test to the environment?


drj$ export -f test
drj$ ./zgrephelp
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found
gzip: /bin/zgrep: bob: command not found
--help.gz: No such file or directory
/bin/zgrep: bob: command not found
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found
/bin/zgrep: bob: command not found

zgrephelp stops working. Remember, zgrephelp is written in C! Of course, zgrephelp runs the program zgrep which is written in… bash! (on my Ubuntu system).

Exporting a function can affect the behaviour of any bash script that you run, including bash scripts that are run on your behalf by other programs, even if you never knew about them, and never knew they were bash scripts. Did you know /bin/zcat is a bash script? (on Ubuntu)

How is this ever useful? Can you ever safely export a function? No, not really. Let’s say you export a function called X. Package Y might install a binary called X and a bash script Z that calls X. Now you’ve broken Z. So you can’t export a function if it has the same name as a binary installed by any package that you might ever install (including packages that you never use directly but that are installed merely to compile some package that you do want to use).

Let’s flip this around, and consider the import side.

When a bash script starts, before it’s read a single line of your script, it will import functions from the environment. These are just environment variables of the form BASH_FUNC_something()=() { function definition here ; }. You don’t have to create those by exporting a function, you can just create an environment variable of the right form:


drj$ env 'BASH_FUNC_foo()=() { baabaa ; }' bash -c foo
bash: baabaa: command not found

Imagine you are writing a bash script and you are a conscientious programmer (this requires a large amount of my imagination). bash will import potentially arbitrary functions, with arbitrary names, from the environment. Can you prevent these functions being defined?

It would appear not.

Any bash script should carefully unset -f prog for each prog that it might call (including builtins like cd; yes you can define a function called cd).
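
Something like this hypothetical preamble (the list of names is illustrative, certainly not exhaustive):

# drop any functions imported from the environment
# for each name this script relies on
for cmd in cd test grep sed rm; do
    unset -f "$cmd"
done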

Except of course, that you can’t do this if unset has been defined as a function.

Why is exporting functions ever useful?

How to patch bash

2014-09-26

3rd in what is now becoming a series on ShellShock. 1st one: 10 tips for turning bash scripts into portable POSIX scripts; 2nd one: why big is bad.

ShellShock is a remote code exploit in /bin/bash (when used in conjunction with other system components). It relies on the way bash exports functions to its environment. If you run the command «env - bash -c 'foo () { hello ; } ; export -f foo ; env'» you can see how this works ordinarily:


PWD=/home/drj/bash-4.2/debian/patches
SHLVL=1
foo=() { hello
}
_=/usr/bin/env

The function foo when exported to the environment turns into an environment variable called foo that starts with «() {». When a new bash process starts it scans its environment to see if there are any functions to import and it is that sequence of characters, «() {», that triggers the import. It imports a function by executing its definition which causes the function to be defined.

The ShellShock bug is a problem in the way bash parses the function out of the environment, which causes any code following the function definition to also be executed. That’s bad.
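
The test string that circulated at the time demonstrates this; on an unpatched bash it prints “vulnerable” before getting around to the actual command:

env x='() { :;}; echo vulnerable' bash -c 'echo this is a test'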

So the problem is with parsing the function definition out of the environment.

Fast forward 22 years after this code was written. That’s present day. S Chazelas discovers that this behaviour can lead to a serious exploit. A patch is issued.

Is this patch a 77 line monster that adds new functionality to the parser?

Why, yes. It is.

Clearly function parsing in bash is already a delicate area, that’s why there’s a bug in it. That should be a warning. The fix is not to go poking about inside the parser.

The fix, as suggested by I Jackson, is to “Disable exporting shell functions because they are mad”. It’s a 4 line patch (well, morally 4 lines) that just entirely removes the notion of importing functions from the environment:

--- a/variables.c
+++ b/variables.c
@@ -347,6 +347,7 @@ initialize_shell_variables (env, privmode)
 
       temp_var = (SHELL_VAR *)NULL;
 
+#if 0 /* Disable exporting shell functions because they are mad. */
       /* If exported function, define it now.  Don't import functions from
         the environment in privileged mode. */
       if (privmode == 0 && read_but_dont_execute == 0 && STREQN ("() {", string, 4))
@@ -380,6 +381,9 @@ initialize_shell_variables (env, privmode)
              report_error (_("error importing function definition for `%s'"), name);
            }
        }
+#else
+      if (0) ; /* needed for syntax */
+#endif
 #if defined (ARRAY_VARS)
 #  if ARRAY_EXPORT
       /* Array variables may not yet be exported. */

In a situation where you want to safely patch a critical piece of infrastructure you do the simplest thing that will work. Later on you can relax with a martini and recraft that parser. That’s why Jackson’s patch is the right one.

Shortly after the embargo was lifted on the ShellShock vulnerability and the world got to know about it and see the patch, someone discovered a bug in the patch and so we have another CVE issued and another patch that pokes about with the parser.

That simply wouldn’t have happened if we’d cut off the head of the monster and burned it with fire. /bin/bash is too big. It’s not possible to reliably patch it. (and just in case you think making patches is easy: I had to update this article to point to Jackson’s corrected patch)

Do you really think this is the last patch?

Why big is bad

2014-09-26

If you know me, you know that I don’t like using /bin/bash for scripting. It’s not that hard to write scripts that are portable, and my earlier “10 tips” article might help.

Why don’t I like /bin/bash? There are many reasons, but it’s mostly about size.


drj$ ls -lL $(which sh bash)
-rwxr-xr-x 1 root root 959120 Sep 22 21:39 /bin/bash
-rwxr-xr-x 1 root root 109768 Mar 29 2012 /bin/sh

/bin/bash is nearly 10 times the size of /bin/sh (which in this case, is dash). It’s bigger because it’s loaded with features that you probably don’t need. An interactive editor (two in fact). That’s great for interactive use, but it’s just a burden for non-interactive scripts. Arrays. Arrays are really super useful and fundamental to many algorithms. In a real programming language. If you need arrays, it’s time for your script to grow up and become a program, in Python, Lua, Go, or somesuch.

Ditto job control.
Ditto Extended Regular Expression matching.
Ditto mapfile.
Ditto a random number generator.
Ditto a TCP/IP stack.

You might think that these things can’t harm you if you don’t use them. That’s not true. We have a little bit of harm just by being bigger. When one thing is 10 times bigger than it needs to be, no one will notice. When everything is 10 times bigger than it needs to be then it’s wasteful, and extremely difficult to fix. These features take up namespace. Got a shell script called source or complete? Can’t use it, those are builtins in bash. They slow things down. Normally I wouldn’t mention speed, but 8 years ago Ubuntu switched from bash to dash for the standard /bin/sh and the speed increase was enough to affect boot time. Probably part of the reason that bash is slower is simply that it’s bigger. There are more things it has to do or check even though you’re not making use of those features.

If you’re unlucky a feature you’ve never heard of and don’t use will interact with another feature or a part of your system and surprise you. If you’re really unlucky it will be a remote code exploit so easy to use you can tweet exploits, which is what ShellShock is. Did you know you can have functions in bash? Did you know you can export them to the environment? Did you know that the export feature works by executing the definition of the function? Did you know that it’s buggy and can execute more than bash expected? Did you know that with CGI you can set environment variables to arbitrary strings?

There are lots of little pieces to reason about when considering the ShellShock bug because bash is big. And that’s after we know about the bug. What about all those features you don’t use and don’t even know about? Have you read and understood the bash man page? Well, those features you’ve never heard of are probably about as secure as the feature that exports functions to the environment, a feature that few people know about, and fewer people use (and in my opinion, no one should use).

The most important thing about security is attitude. It’s okay to have the attitude that a shell should have lots of useful interactive features; it’s arguable that a shell should have a rich programming environment that includes arrays and hash tables.

It’s not okay to argue that this piece of bloatware should be installed as the standard system shell.

10 tips for turning bash scripts into portable POSIX scripts

2014-09-26

In the light of ShellShock you might be wondering whether you really need bash at all. A lot of things that are bash-specific have portable alternatives that are generally only a little bit less convenient. Here’s a few tips:

1. Avoid «[[»

bash-specific:

if [[ $1 = yes ]]

Portable:

if [ "$1" = "yes" ]

Due to problematic shell tokenisation rules, Korn introduced the «[[» syntax into ksh in 1988 and bash copied it. But it’s never made it into the POSIX specs, so you should stick with the traditional single square bracket. As long as you double quote all the things, you’ll be fine.
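
A sketch of why the double quoting matters with the single bracket (the error text here is bash’s; other shells word it differently):

$ set --                # simulate having no arguments, so $1 is unset
$ [ $1 = yes ]          # expands to: [ = yes ]
bash: [: =: unary operator expected
$ [ "$1" = "yes" ]      # expands to: [ "" = "yes" ], which is fine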


2. Avoid «==» for testing for equality

bash-specific:

if [ "$1" == "yes" ]

Portable:

if [ "$1" = "yes" ]

The double equals operator, «==», is a bit too easy to use accidentally for old-school C programmers. It’s not in the POSIX spec, and the portable operator is single equals, «=», which works in all shells.

Technically when using «==» the thing on the right is a pattern. If you see something like this: «[[ $- == *i* ]]» then see tip 8 below.

3. Avoid «cd -»

bash-specific:

cd -

ksh and bash:

cd ~-

You tend to only see «cd -» used interactively or in weird things like install scripts. It means cd back to the previous place.

Often you can use a subshell instead:

... do some stuff in the original working directory
( # subshell
cd newplace
... do some stuff in the newplace
) # popping out of the subshell
... do some more stuff in the original working directory

But if you can’t use a subshell then you can always store the current directory in a variable:

old=$(pwd)

then do «cd "$old"» when you need to go there. If you must cling to the «cd -» style then at least consider replacing it with «cd ~-» which works in ksh as well as bash and is blessed as an allowed extension by POSIX.

4. Avoid «&>»

bash-specific:

ls &> /dev/null

Portable:

ls > /dev/null 2>&1

You can afford to take the time to do the extra typing. Is there some reason why you have to type this script in as quickly as possible?

5. Avoid «|&»

bash-specific:

ls xt yt |& tee log

Portable:

ls xt yt 2>&1 | tee log

This is a variant on using «&>» for redirection. It’s a pipeline that pipes both stdout and stderr through the pipe. The portable version is only a little bit more typing.

6. Avoid «function»

bash-specific:

function foo { ... }

Portable:

foo () { ... }

Don’t forget, you can’t export these to the environment. *snigger*

7. Avoid «((»

bash-specific:

((x = x + 1))

Portable:

x=$((x + 1))

The «((» syntax was another thing introduced by Korn and copied into bash.

8. Avoid using «==» for pattern matching

bash-specific:

if [[ $- == *i* ]]; then ... ; fi

Portable:

case $- in (*i*) ... ;; esac

9. Avoid «$’something’»

bash-specific:

nl=$'\n'

Portable:

nl='
'

You may not know that you can just include newlines in strings. Yes, it looks ugly, but it’s totally portable.

If you’re trying to get even more bizarre characters like ISO 646 ESC into a string you may need to investigate printf:

Esc=$(printf '\33')

or you can just type the ESC character right into the middle of your script (you might find Ctrl-V helpful in this case). Word of caution if using printf: while octal escapes, such as \377, are portable POSIX syntax, hex escapes are not.

10. «$PWD» is okay

A previous version of this article said to avoid «$PWD» because I had been avoiding it since the dark ages (there was a time when some shells didn’t implement it and some did).

Conclusion

Most of these are fairly simple replacements. The simpler tokenisation that Korn introduced for «[[» and «((» is welcome, but it comes at the price of portability. I suspect that most of the bash-specific features are introduced into scripts unwittingly. If more people knew about portable shell programming, we might see more portable shell scripts.

I’m sure there are more, and I welcome suggestions in the comments or on Twitter.

Thanks to Gareth Rees who suggested minor changes to #3 and #7 and eliminated #10.

Piping into shell may be harmful

2014-03-19

Consider

curl https://thing | sh

It has become fashionable to see this kind of shell statement as a quick way of installing various bits of software (nvm, docker, salt).

This is a bad idea.

Imagine that due to bad luck the remote server crashes halfway through sending the text of the shell script, and a line that reads

rm -fr /usr/local/go

gets truncated and now reads:

rm -fr /

You’re hosed.

curl may write a partial file to the pipe and sh has no way of knowing.

Can we defend against this?

Initially I was pessimistic. How can we consider all possible truncations of a shell program? But then I realised, after a couple of false turns, that it’s possible to construct a shell script that is syntactically valid only when the entire script has been transmitted; truncating this special shell script at any point will result in a syntactically invalid script which shell will refuse to execute.

Moreover this useful property of being syntactically valid only when complete is not just a property of a small specially selected set of shell scripts, it’s entirely general. It turns out to be possible to transform any syntactically valid shell script into one that has the useful property:

{
...
any old script here
...
}

We can bracket the entire script with the grouping keywords { and }. Now if the script gets cut off somewhere in the middle, the { at the beginning will be missing its matching } and the script will be syntactically invalid. Shell won’t execute the partial script.

As long as the script in the middle is syntactically valid, then the bracketed script will be syntactically valid too.
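
A quick sketch of the property in action (the error wording varies from shell to shell):

$ printf '{\necho hello\n' | sh      # truncated: the closing } never arrives
sh: Syntax error: end of file unexpected (expecting "}")
$ printf '{\necho hello\n}\n' | sh   # the complete script runs
hello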

Let’s call these curly brackets that protect the script from truncation, Jones truncation armour.

Clearly Jones truncation armour should be applied to all scripts that may be piped directly into shell.

Can it be applied as an aftermarket add-on? Yes it can!

{ echo { && curl https://thing && echo } ; } | sh

Maybe this is even better. It means that the consumer of the script doesn’t have to rely on the provider to add the Jones truncation armour. But it also doesn’t matter if the script is already armoured. It still works.

Making change with shell

2012-03-13

I was flicking through Wikström’s «Functional Programming Using Standard ML», when I noticed he describes the problem of making up change for an amount of money m using coins of certain denominations (18.1.2, page 233). He says we “want a function change that given an amount finds the smallest number of coins that adds to that amount”, and “Obviously, you first select the biggest possible coin”. Here’s his solution in ML:

exception change;
fun change 0 cs = nil
  | change m nil = raise change
  | change m (ccs as c::cs) = if m >= c
      then c::change (m-c) ccs
      else change m cs;

It’s pretty neat. The recursion proceeds by either reducing the magnitude of the first argument (the amount we are giving change for), or reducing the size of the list that is the second argument (the denominations of the coins we can use); so we can tell that the recursion must terminate. Yay.

It’s not right though. Well, it gives correct change, but it doesn’t necessarily find the solution with the fewest number of coins. Actually, it depends on the denominations of coins in our currency; probably for real currencies the “biggest coin first” algorithm does in fact give the fewest number of coins, but consider the currency used on the island of san side-effect, the lambda. lambdas come in coins of Λ1, Λ10, and Λ25 (that’s not a wedge, it’s a capital lambda. It’s definitely not a fake A).

How do we give change of Λ30? { Λ10, Λ10, Λ10 } (3 tens); what does Wikström’s algorithm give? 1 twenty-five and 5 ones. Oops.

I didn’t work out a witty solution to the fewest number of coins change, but I did create, in shell, a function that lists all the possible ways of making change. Apart from trivial syntactic changes, it’s not so different from the ML:

_change () {
    # $1 is the amount to be changed;
    # $2 is the largest coin;
    # $3 is a comma separated list of the remaining coins

    if [ "$1" -eq 0 ] ; then
        echo ; return
    fi
    if [ -z "$2" ] ; then
        return
    fi
    _change $1 ${3%%,*} ${3#*,}
    if [ "$1" -lt "$2" ] ; then
        return
    fi
    _change $(($1-$2)) $2 $3 |
        while read a ; do
            echo $2 $a
        done
}

change () {
    _change $1 ${2%%,*} ${2#*,},
}

Each solution is output as a single line with the coins used in a space separated list. change is a wrapper around _change which does the actual work. The two base cases are basically identical: «"$1" -eq 0» is when we have zero change to give, and we output an empty line (just a bare echo) which is our representation for the empty list; «-z "$2"» is when the second argument (the first element of the list of coins) is empty, and, instead of raising an exception, we simply return without outputting any list at all.

The algorithm to generate all possible combinations of change is only very slightly different from Wikström’s: if we can use the largest coin, then we generate change both without using the largest coin (first recursive call to _change, on line 12) and using the largest coin (second recursive call to _change, on line 16). See how we use a while loop to prepend (cons, if you will) the largest coin value to each list returned by the second recursive call. Of course, when the largest coin is too large then we proceed without it, and we only have the first recursive call.

The list of coins is managed as two function arguments. $2 is the largest coin, $3 is a comma separated list of the remaining coins (including a trailing comma, added by the wrapping change function). See how, in the first recursive call to _change $3 is decomposed into a head and tail with ${3%%,*} and ${3#*,}. As hinted at in the previous article, «%%» is a greedy match and removes the largest suffix that matches the pattern «,*» which is everything from the first comma to the end of the string, and so it leaves the first number in the comma separated list. «#» is a non-greedy match and removes the smallest prefix that matches the pattern «*,», so it removes the first number and its comma from the list. Note how I am assuming that all the arguments do not contain spaces, so I am being very cavalier with double quotes around my $1 and $2 and so on.

It even works:

$ change 30 25,10,1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 10 1 1 1 1 1 1 1 1 1 1
10 10 10
25 1 1 1 1 1
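
Since each line of output is one solution, a hedged way to recover the “fewest coins” answer is to pick the line with the fewest fields (a sketch using awk):

change 30 25,10,1 |
    awk 'NR == 1 || NF < best { best = NF; fewest = $0 } END { print fewest }'

which prints «10 10 10».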

Taking the bash out of Mark

2012-03-05

Mark Dominus, in his pretty amusing article about exact rational arithmetic in shell, gives us this little (and commented!) shell function:

        # given an input number which might be a decimal, convert it to
        # a rational number; set n and d to its numerator and
        # denominator.  For example, 3.3 becomes n=33 and d=10;
        # 17 becomes n=17 and d=1.
        to_rational() {
          # Crapulent bash can't handle decimal numbers, so we will convert
          # the input number to a rational
          if [[ $1 =~ (.*)\.(.*) ]] ; then
              i_part=${BASH_REMATCH[1]}
              f_part=${BASH_REMATCH[2]}
              n="$i_part$f_part";
              d=$(( 10 ** ${#f_part} ))
          else
              n=$1
              d=1
          fi
        }

Since I’m on a Korn overdrive, what would this look like without the bashisms? Dominus uses BASH_REMATCH to split a decimal fraction at the decimal point, thus splitting ‘iii.fff’ into ‘iii’ and ‘fff’. That can be done using portable shell syntax (that is, blessed by the Single Unix Specification) using the ‘%’ and ‘#’ features of parameter expansion. Example:

$ f=3.142
$ echo ${f%.*}
3
$ echo ${f#*.}
142

In shell, «${f}» is the value of the variable (parameter) f; you probably knew that. «${f%pattern}» removes any final part of f that matches pattern (which is a shell pattern, not a regular expression). «${f#pattern}» removes any initial part of f that matches pattern (full technical details: they remove the shortest match; use %% and ## for greedy versions).

Thus, between them «${f%.*}» and «${f#*.}» are the integer part and fractional part (respectively) of the decimal fraction. The only problem is when the number has no decimal point. Well, Dominus special cased that too. Of course the “=~” operator is a bashism (did perl inspire bash, or the other way around?), so portable shell programmers have to use ‘case’ (which traditionally was always preferred even when ‘[’ could be used because ‘case’ didn’t fork another process). At least this version features a secret owl hidden away (on line 3):

to_rational () {
  case $1 in
    (*.*) i_part=${1%.*} f_part=${1#*.}
      n="$i_part$f_part"
      d=$(( 10 ** ${#f_part} )) ;;
    (*) n=$1 d=1 ;;
  esac
}

The ‘**’ in the arithmetic expression raised a doubt in my mind and, *sigh*, it turns out that it’s not portable either (it does work in ‘ksh’, but it’s not in the Single Unix Specification). Purists have to use a while loop to add a ‘0’ digit for every digit removed from f_part:

to_rational () {
  case $1 in
    (*.*) i_part=${1%.*} f_part=${1#*.}
      n="$i_part$f_part"
      d=1;
      while [ -n "${f_part}" ] ; do
          d=${d}0
          f_part=${f_part%?}
      done ;;
    (*) n=$1 d=1 ;;
  esac
}
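
A quick check of the portable version, assuming the function has been sourced into the current shell:

$ to_rational 3.3 && echo "$n/$d"
33/10
$ to_rational 17 && echo "$n/$d"
17/1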

Traditional shell didn’t support this «${f%.*}» stuff, but it’s been in Single Unix Specification for ages. It’s been difficult to find a Unix with a shell that didn’t support this syntax since about the year 2000. It’s time to start to be okay about using it.

Interactive Shells and their prompts

2012-03-04

What can we say about what the “-i” option to shell does? It varies according to the shell. Below, I start with PS1 set to “demo$ ”, which is not my default. (you can probably work it out from the transcript, but) it might help to know that my default shell is bash (for now). There’s nothing special about ‘pwd’ in the examples below, it’s just a command with a short name that outputs something.

demo$ echo pwd | ksh
/home/drj/hackdexy/content
demo$ echo pwd | bash
/home/drj/hackdexy/content
demo$ echo pwd | ksh -i
$ /home/drj/hackdexy/content
$ 
demo$ echo pwd | bash -i
drj$ pwd
/home/drj/hackdexy/content
drj$ exit
demo$ echo pwd | bash --norc -i
bash-4.2$ pwd
/home/drj/hackdexy/content
bash-4.2$ exit
demo$ echo pwd | sh -i
$ /home/drj/hackdexy/content
$ 
sh: Cannot set tty process group (No such process)
demo$ . ../bin/activate
(hackdexy)demo$ echo pwd | ksh -i
(hackdexy)demo$ /home/drj/hackdexy/content
(hackdexy)demo$ 
(hackdexy)demo$ deactivate

What have we learnt? (when stdin is not a terminal) shells do not issue prompts unless the “-i” option is used. That’s basically what “-i” does. bash, but not ksh, will echo the command. That has the effect of making bash’s output seem more like an interactive terminal session.

Both ‘bash’ and ‘ksh’ changed the prompt. ‘bash’ changed the prompt because it sources my ‘.bashrc’ and that sets PS1; ‘ksh’ changed my prompt, apparently because PS1 is not an exported shell variable, and so ‘ksh’ does not inherit PS1 from its parent process, and so sets it to the default of “$ ”. If we stop ‘bash’ from sourcing ‘.bashrc’ by using the ‘--norc’ option (‘--posix’ will do as well) then it too will use its default prompt: the narcissistic “bash-4.2$ ”.

‘sh’ (dash on my Ubuntu laptop, apparently), is like ‘ksh’ in that it does not echo the commands. But it does output a narked warning message.

The last command, after using ‘activate’ from virtualenv, demonstrates that ‘activate’ must export PS1. I think that is a bug, but I would welcome comment on this matter.
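
One way to see that (a sketch; env lists only the exported variables):

drj$ . ../bin/activate
(hackdexy)drj$ env | grep '^PS1='
PS1=(hackdexy)demo$ 

If PS1 shows up in the output of env, it has been exported.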

I guess most people use ‘bash’ and mostly set PS1 in their ‘.bashrc’, so wouldn’t notice any of this subtlety (for example, they will not notice that ‘activate’ exports PS1 because the ‘.bashrc’ will set it to something different).

I note that the example login and ENV scripts given in The KornShell Command and Programming Language (p241, 1989 edition) set PS1 in the login script only, and do not export it. Meaning that subshells will have the default prompt. I quite like that (it also means that subshells from an ‘activate’d shell will have the ‘activate’ prompt). Why doesn’t anyone do it like that any more?