Archive for the 'shell' Category

Making change with shell

2012-03-13

I was flicking through Wikström’s «Functional Programming Using Standard ML», when I noticed he describes the problem of making up change for an amount of money m using coins of certain denominations (18.1.2, page 233). He says we “want a function change that given an amount finds the smallest number of coins that adds to that amount”, and “Obviously, you first select the biggest possible coin”. Here’s his solution in ML:

exception change;
fun change 0 cs = nil
  | change m nil = raise change
  | change m (ccs as c::cs) = if m >= c
      then c::change (m-c) ccs
      else change m cs;

It’s pretty neat. The recursion proceeds by either reducing the magnitude of the first argument (the amount we are giving change for), or reducing the size of the list that is the second argument (the denominations of the coins we can use); so we can tell that the recursion must terminate. Yay.

It’s not right though. Well, it gives correct change, but it doesn’t necessarily find the solution with the fewest number of coins. Actually, it depends on the denominations of coins in our currency; probably for real currencies the “biggest coin first” algorithm does in fact give the fewest number of coins, but consider the currency used on the island of San Side-Effect, the lambda. Lambdas come in coins of Λ1, Λ10, and Λ25 (that’s not a wedge, it’s a capital lambda. It’s definitely not a fake A).

How do we give change for Λ30? { Λ10, Λ10, Λ10 } (3 tens); what does Wikström’s algorithm give? 1 twenty-five and 5 ones. Oops.

I didn’t work out a witty solution to the fewest-coins problem, but I did create, in shell, a function that lists all the possible ways of making change. Apart from trivial syntactic changes, it’s not so different from the ML:

_change () {
    # $1 is the amount to be changed;
    # $2 is the largest coin;
    # $3 is a comma separated list of the remaining coins

    if [ "$1" -eq 0 ] ; then
        echo ; return
    fi
    if [ -z "$2" ] ; then
        return
    fi
    _change $1 ${3%%,*} ${3#*,}
    if [ "$1" -lt "$2" ] ; then
        return
    fi
    _change $(($1-$2)) $2 $3 |
        while read a ; do
            echo $2 $a
        done
}

change () {
    _change $1 ${2%%,*} ${2#*,},
}

Each solution is output as a single line with the coins used in a space separated list. change is a wrapper around _change, which does the actual work. The two base cases are basically identical: «"$1" -eq 0» is when we have zero change to give, and we output an empty line (just a bare echo) which is our representation for the empty list; «-z "$2"» is when the second argument (the first element of the list of coins) is empty, and, instead of raising an exception, we simply return without outputting any list at all.

The algorithm to generate all possible combinations of change is only very slightly different from Wikström’s: if we can use the largest coin, then we generate change both without using the largest coin (first recursive call to _change, on line 12) and using the largest coin (second recursive call to _change, on line 16). See how we use a while loop to prepend (cons, if you will) the largest coin value to each list returned by the second recursive call. Of course, when the largest coin is too large then we proceed without it, and we only have the first recursive call.

The list of coins is managed as two function arguments. $2 is the largest coin, and $3 is a comma separated list of the remaining coins (including a trailing comma, added by the wrapping change function). See how, in the first recursive call to _change, $3 is decomposed into a head and tail with ${3%%,*} and ${3#*,}. As hinted at in the previous article, «%%» is a greedy match and removes the largest suffix that matches the pattern «,*», which is everything from the first comma to the end of the string, so it leaves the first number in the comma separated list. «#» is a non-greedy match and removes the smallest prefix that matches the pattern «*,», so it removes the first number and its comma from the list. Note that I am assuming none of the arguments contain spaces, so I am being very cavalier about double quotes around my $1 and $2 and so on.
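To see the head/tail decomposition in isolation, here is an assumed interactive session, with the trailing comma included just as the change wrapper would add it:

$ coins=25,10,1,
$ echo ${coins%%,*}
25
$ echo ${coins#*,}
10,1,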

It even works:

$ change 30 25,10,1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 10 1 1 1 1 1 1 1 1 1 1
10 10 10
25 1 1 1 1 1
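Since each solution is one line with one coin per field, a fewest-coins answer can be picked off the top by counting fields. This is a post-processing trick of my own, not Wikström’s algorithm; it assumes the usual awk, sort, and sed:

$ change 30 25,10,1 | awk '{ print NF, $0 }' | sort -n | sed 1q
3 10 10 10

awk prefixes each line with its field count NF, sort -n floats the smallest count to the top, and sed 1q keeps only that line.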

Taking the bash out of Mark

2012-03-05

Mark Dominus, in his pretty amusing article about exact rational arithmetic in shell gives us this little (and commented!) shell function:

        # given an input number which might be a decimal, convert it to
        # a rational number; set n and d to its numerator and
        # denominator.  For example, 3.3 becomes n=33 and d=10;
        # 17 becomes n=17 and d=1.
        to_rational() {
          # Crapulent bash can't handle decimal numbers, so we will convert
          # the input number to a rational
          if [[ $1 =~ (.*)\.(.*) ]] ; then
              i_part=${BASH_REMATCH[1]}
              f_part=${BASH_REMATCH[2]}
              n="$i_part$f_part";
              d=$(( 10 ** ${#f_part} ))
          else
              n=$1
              d=1
          fi
        }

Since I’m on a Korn overdrive, what would this look like without the bashisms? Dominus uses BASH_REMATCH to split a decimal fraction at the decimal point, thus splitting ‘iii.fff’ into ‘iii’ and ‘fff’. That can be done using portable shell syntax (that is, blessed by the Single Unix Specification) using the ‘%’ and ‘#’ features of parameter expansion. Example:

$ f=3.142
$ echo ${f%.*}
3
$ echo ${f#*.}
142

In shell, «${f}» is the value of the variable (parameter) f; you probably knew that. «${f%pattern}» removes any final part of f that matches pattern (which is a shell pattern, not a regular expression). «${f#pattern}» removes any initial part of f that matches pattern (full technical details: they remove the shortest match; use %% and ## for greedy versions).

Thus, between them «${f%.*}» and «${f#*.}» are the integer part and fractional part (respectively) of the decimal fraction. The only problem is when the number has no decimal point. Well, Dominus special cased that too. Of course the “=~” operator is a bashism (did perl inspire bash, or the other way around?), so portable shell programmers have to use ‘case’ (which traditionally was always preferred, even when ‘[’ could be used, because ‘case’ didn’t fork another process). At least this version features a secret owl hidden away (on line 3):

to_rational () {
  case $1 in
    (*.*) i_part=${1%.*} f_part=${1#*.}
      n="$i_part$f_part"
      d=$(( 10 ** ${#f_part} )) ;;
    (*) n=$1 d=1 ;;
  esac
}

The ‘**’ in the arithmetic expression raised a doubt in my mind and, *sigh*, it turns out that it’s not portable either (it does work in ‘ksh’, but it’s not in the Single Unix Specification). Purists have to use a while loop to add a ‘0’ digit for every digit removed from f_part:

to_rational () {
  case $1 in
    (*.*) i_part=${1%.*} f_part=${1#*.}
      n="$i_part$f_part"
      d=1
      while [ -n "${f_part}" ] ; do
          d=${d}0
          f_part=${f_part%?}
      done ;;
    (*) n=$1 d=1 ;;
  esac
}
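A quick sanity check of the portable version (n and d are left in global variables, as in Dominus’s original):

$ to_rational 3.142; echo "$n / $d"
3142 / 1000
$ to_rational 17; echo "$n / $d"
17 / 1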

Traditional shell didn’t support this «${f%.*}» stuff, but it’s been in the Single Unix Specification for ages. It’s been difficult to find a Unix with a shell that didn’t support this syntax since about the year 2000. It’s time to start being okay about using it.

Interactive Shells and their prompts

2012-03-04

What can we say about what the “-i” option to a shell does? It varies according to the shell. Below, I start with PS1 set to “demo$ ”, which is not my default. (You can probably work it out from the transcript, but) it might help to know that my default shell is bash (for now). There’s nothing special about ‘pwd’ in the examples below; it’s just a command with a short name that outputs something.

demo$ echo pwd | ksh
/home/drj/hackdexy/content
demo$ echo pwd | bash
/home/drj/hackdexy/content
demo$ echo pwd | ksh -i
$ /home/drj/hackdexy/content
$ 
demo$ echo pwd | bash -i
drj$ pwd
/home/drj/hackdexy/content
drj$ exit
demo$ echo pwd | bash --norc -i
bash-4.2$ pwd
/home/drj/hackdexy/content
bash-4.2$ exit
demo$ echo pwd | sh -i
$ /home/drj/hackdexy/content
$ 
sh: Cannot set tty process group (No such process)
demo$ . ../bin/activate
(hackdexy)demo$ echo pwd | ksh -i
(hackdexy)demo$ /home/drj/hackdexy/content
(hackdexy)demo$ 
(hackdexy)demo$ deactivate

What have we learnt? When stdin is not a terminal, shells do not issue prompts unless the “-i” option is used. That’s basically what “-i” does. bash, but not ksh, will echo the commands. That has the effect of making bash’s output seem more like an interactive terminal session.

Both ‘bash’ and ‘ksh’ changed the prompt. ‘bash’ changed the prompt because it sources my ‘.bashrc’ and that sets PS1; ‘ksh’ changed my prompt, apparently because PS1 is not an exported shell variable, and so ‘ksh’ does not inherit PS1 from its parent process, and so sets it to the default of “$ ”. If we stop ‘bash’ from sourcing ‘.bashrc’ by using the ‘--norc’ option (‘--posix’ will do as well) then it too will use its default prompt: the narcissistic “bash-4.2$ ”.
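You can check the inheritance without virtualenv; here is a sketch of a transcript (the prompts are illustrative), assuming the same bash parent as above:

demo$ echo pwd | ksh -i
$ /home/drj/hackdexy/content
$ 
demo$ export PS1
demo$ echo pwd | ksh -i
demo$ /home/drj/hackdexy/content
demo$ 

After the export, ‘ksh’ inherits PS1 from its parent and prompts with it.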

‘sh’ (dash on my Ubuntu laptop, apparently) is like ‘ksh’ in that it does not echo the commands. But it does output a narked warning message.

The last command, after using ‘activate’ from virtualenv, demonstrates that ‘activate’ must export PS1. I think that is a bug, but I would welcome comment on this matter.

I guess most people use ‘bash’ and mostly set PS1 in their ‘.bashrc’, so they wouldn’t notice any of this subtlety (for example, they will not notice that ‘activate’ exports PS1, because their ‘.bashrc’ will set it to something different).

I note that the example login and ENV scripts given in The KornShell Command and Programming Language (p241, 1989 edition) set PS1 in the login script only, and do not export it. That means subshells will have the default prompt. I quite like that (it also means that subshells from an ‘activate’d shell will have the ‘activate’ prompt). Why doesn’t anyone do it like that any more?

Unix command line pebbles

2012-01-13

I always find that “sitting next to someone new” is an interesting and revealing experience. I learn new things, they learn new things. Over the years I’ve accumulated a bunch of tricks about using the command line, using vi, and even using emacs. The things I’m talking about can’t really be learned in a formal context; they are too small and there are too many of them. Yet collectively they form part of the fabric of what it means to be a vi user or a shell programmer or whatever. Picking them up and passing them on by sitting next to someone is the only way I know to spread them. Well, maybe this blog post is a start.

Here’s a few I’ve recently picked up or passed on:

cd -

It changes directory to the last directory that you changed from. It’s /bin/bash not /bin/sh (so I personally would avoid writing it in scripts). I think someone alleged that this was undocumented, but I recently checked and it is in fact documented in the builtins section of the /bin/bash man page, which is huge.

It’s useful when you want to cd to some directory to run a command, but then cd back.

Line 25 of the runlocal script in ScraperWiki used to do just that in order to start a service from within a different current directory:

cd ../services/scriptmgr
node ./scriptmgr.js &
cd -

Because I don’t like writing bash-specific scripts, I changed this to use a subshell, with round brackets:

(
    cd ../services/scriptmgr
    node ./scriptmgr.js &
)

A subshell is used to run the code between the round brackets, and it is just a forked copy of the shell. It has exactly the same environment (environment variables, open files, and so on), but is in another process. Since the Current Working Directory is specific to the process, changing it in the subshell has no effect on the outer shell, which carries on unaffected after the subshell has returned. So that’s something I’ve passed on recently.
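You can see the effect at an interactive prompt (directory names illustrative):

$ pwd
/home/drj
$ (cd /tmp && pwd)
/tmp
$ pwd
/home/drj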

cp foo.cnf{-inhg,}

This is the same as cp foo.cnf-inhg foo.cnf. It copies a file by removing the -inhg extension from its name. ScraperWiki stores some config files in Mercurial (the -inhg versions), and they are copied so they can be edited locally (the versions without the -inhg suffix). I’m sure this has all sorts of evil uses.
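Brace expansion (a bash feature, not portable /bin/sh) happens before the command runs, so you can preview what any such invocation will do with echo:

$ echo cp foo.cnf{-inhg,}
cp foo.cnf-inhg foo.cnf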

sudo !!

Of the things I’ve picked up recently, this is my favourite. I like how some of these tips are a combination of things that I already knew individually but hadn’t thought to combine. Of course I know that sudo thing runs thing as root, and I did once know that !! is the /bin/csh mechanism (stolen by /bin/bash) for executing the previously typed command. In combination, sudo !! runs the previously typed command as root. Which is perfect for when you typed pip install thinginator instead of sudo pip install thinginator.

Ctrl-R

The favourite thing I seem to have spread is Ctrl-R. (In /bin/bash) you press Ctrl-R, start typing, and bash will search your command history for what you type; press Return to execute the command, or Ctrl-R again to find the next older match. Press Ctrl-C if you didn’t really want to do that.

Credits

One of the @pysheff crowd for sudo !! (I can’t remember whether it was @davbo or @dchetwynd).

Ross Jones for cd -.

Bitbucket user mammadori for “{-ext,}”.

If you enjoyed this post, you may also like Mark Dominus’ semi-rant on the craptastic way to do calculation in shell scripts. Huh, now I feel compelled to note that Mark gets it wrong about GNU Shell; arithmetic expansion is not a GNU innovation, it comes from POSIX, having been slightly mutated from an earlier Korn Shell feature. And if he had comments I would’ve said that on his “blag”.

Awesome Shell Power: stupid z trick

2007-07-17

Say you want to move all the files in the current directory down into a new directory called “awesome”.

Instead of:

mkdir awesome
mv * awesome

which is likely to complain: “mv: rename awesome to awesome/awesome: Invalid argument” (though it does in fact work), you might try:

mkdir z
mv *
mv z awesome

This “works” when the newly created directory, called “z”, comes at the end of the list that “*” expands to. You weren’t in the habit of creating files beginning with z, were you?
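Why it works: given more than two arguments, mv treats the last one as the destination directory. With hypothetical files a and b you can preview the expansion:

$ ls
a  b  z
$ echo mv *
mv a b z

so a and b get moved into z.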

Awesome Shell Power: yes '' | head | nl -ba

2007-05-03

Often when programming in shell you want a sequence of numbers. bash has a quirky for syntax for doing it, but I’m not really interested in that. I want something portable.

«yes '' | head | nl -ba» will generate a sequence of numbers from 1 to 10. 1 to N can be achieved by specifying an argument to head: «yes '' | head -7 | nl -ba», or, as I never tire of telling people, using sed instead of head: «yes '' | sed 7q | nl -ba». The sed version is one character shorter.

As long as we’ve invoked sed we may as well read the manual to get: «yes | sed -n '=;7q'». Note that as sed is not actually processing the input lines, just counting them, it doesn’t matter what their contents are, so we no longer need an argument to yes.

It turns out that yes is not a standard part of the Unix utilities. So much for portability. Whilst we could make an adequate simulation of yes with «while : ; do echo y ; done», it’s much more amusing to use dd:

$ dd < /dev/zero ibs=1 cbs=1 count=7 conv=unblock | sed -n =
7+0 records in
0+1 records out
1
14 bytes transferred in 0.000060 secs (233945 bytes/sec)
2
3
4
5
6
7

I love dd for its obscurity, its parody of the DD statement in JCL, and the way no-one seems to know what it is for any more. There are some stderr turds left by dd which we can eliminate with a quick 2>&-: «dd < /dev/zero 2>&- ibs=1 cbs=1 count=7 conv=unblock | sed -n =».

This version of the command is improved in many ways: “count=7” is practically self-documenting; we no longer need quotes around the sed program (though the way a bare = appears at the end does seem like a bit of a non sequitur, especially given all the other = signs in the dd command, which have a completely different meaning). However, it’s unsatisfying in many other ways. It uses /dev/zero, which isn’t guaranteed to exist (even on conformant Unix OSes). It squirts binary data (a sequence of alternating NUL and LF characters) into sed, which isn’t obliged to behave on such input. It’s long and silly.

Much better to just generate the sequence directly using something approaching a real programming language:

awk 'BEGIN{ for(i=1;i<=7;++i) print i}'

Even those who don’t know the mighty awk can probably work out what it does. It’s portable (some now truly ancient systems, that is SunOS 4.x, might need an extra < /dev/null); it’s clear; it works.
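Wrapped up as a shell function it makes a serviceable sequence generator (the name iota is my own hypothetical choice; only POSIX awk is assumed):

iota () {
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; ++i) print i }' < /dev/null
}

so that «iota 7» produces the numbers 1 to 7, one per line.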

Of course on systems with jot installed you can just go jot 7. That includes FreeBSD and OS X. It turns out that jot is named after APL’s ɩ, iota, which is spelt i. in J.

Awesome Shell Power: 2>&1 | tee …

2007-04-20

Collect the stdout and stderr of a command into a single file and see the output interactively:

some command 2>&1 | tee err

I use this with make so often that I have a script called mk:

make "$@" 2>&1 | tee err

(and don’t forget, we always use "$@", never $*)
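A contrived but illustrative demonstration of the difference (the embedded space is the point):

$ set -- 'a b' c
$ printf '[%s]\n' "$@"
[a b]
[c]
$ printf '[%s]\n' $*
[a]
[b]
[c]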

The output file is named err but would probably be better named ofile (suggested by Nick B in comments). I use err in deference to 15 years of my personal Unix tradition.

The Order of Redirections

Now let’s take the pipe out of the equation for a moment and consider redirections alone. On my Mac OS X laptop the command ls -d /eta /etc produces the following output:

ls: /eta: No such file or directory
/etc

The “No such file or directory” message appears on stderr and “/etc” appears on stdout.

You can prove that to yourself by closing one or the other file descriptor. Play around with ls -d /eta /etc >&- and ls -d /eta /etc 2>&-, and convince yourself that this is so.

Shell redirections are processed from left to right; we can get the stderr of a command to appear on its stdout and get rid of its stdout entirely:

$ ls -d /eta /etc 2>&1 1>&-
ls: /eta: No such file or directory

You might like to think about why this works. fd 2 is redirected to fd 1, meaning output to fd 2 goes to the same place as fd 1 at the time that the redirection was made. Then fd 1 is closed; this doesn’t affect the previous redirection of fd 2.

Doing it the other way round, we get this:

$ ls -d /eta /etc 1>&- 2>&1
-bash: 1: Bad file descriptor

That’s because by the time that it comes to redirect fd 2 to fd 1, fd 1 has been closed and can’t be used for a redirect.

How can we tell that 2>&1 1>&- sends what normally appears on stderr to stdout? Does this convince you:

$ ( ls -d /eta /etc 2>&1 1>&- ) | sed 's/^/*** /'
*** ls: /eta: No such file or directory

By borrowing an extra fd you can even swap stderr and stdout over:

$ ls -d /eta /etc 3>&2 2>&1 1>&3 3>&- | sed 's/^/*** /'
/etc
*** ls: /eta: No such file or directory

Notice that the error message from ls has gone through the pipe and been transformed by sed, so it must have been sent to stdout. The normal output of ls, “/etc”, has not gone through the pipe, so must’ve been sent to stderr.

Awesome Shell Power: */.

2007-04-17

I probably know lots of tips and tricks about using /bin/sh that are not widely enough known.

Today’s shell tip: */.

It’s a shell pattern that expands to all the sub-directories of the current directory.

If you want to list all the names of the sub-directories in the current directory, it’s a lot easier to type echo */. than it is to type find . -name . -o -type d -print -prune, which is almost equivalent.

ls -ld */.
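The -d stops ls from listing the contents of each directory, so you get one long-format line per sub-directory. For example, in a directory with hypothetical sub-directories doc and src (and an ordinary file README, which the pattern skips):

$ echo */.
doc/. src/.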
