Tuesday's Tips

Welcome to Tuesday's Tips for shell scripting.

These tips come from my own scripting as well as answers I have provided to queries in various Usenet newsgroups (e.g., comp.unix.shell and comp.os.linux.misc).

The series ran from April to September, 2004, at which time I began work on a book of shell scripts. Due to the demands that project made on my time, I was unable to continue the series.

  1. The Easy PATH (28 Sep 2004)
  2. flocate — locate a file (21 Sep 2004)
  3. Toggling a variable (31 Aug 2004)
  4. Redirecting stdout and stderr (24 Aug 2004)
  5. A list of directories (17 Aug 2004)
  6. Learning to read (10 Aug 2004)
  7. New Bash Parameter expansion ( 3 Aug 2004)
  8. Heads up (27 Jul 2004)
  9. Random thoughts (20 Jul 2004)
  10. Useful variables (13 Jul 2004)
  11. Centering text on a line ( 6 July 2004)
  12. Setting multiple variables with one command (29 June 2004)
  13. Adding numbers from a file (22 June 2004)
  14. How many days in the month? (15 June 2004)
  15. Is this a leap year? ( 8 June 2004)
  16. Searching man pages with sman() ( 1 June 2004)
  17. Removing non-consecutive duplicate lines from a file (25 May 2004)
  18. Printing to entire width of screen (18 May 2004)
  19. Extracting multiple values from a string ( 4 May 2004)
  20. Automatic array indexing (27 April 2004)

28 Sep 2004:  The Easy PATH

To format one's PATH variable for easy viewing, try this function:

path()
{
    oldIFS=$IFS
    IFS=:
    printf "%s\n" $PATH
    IFS=$oldIFS
}   

A typical run of the function:

$ path
/bin
/usr/bin
/usr/bin/X11
/usr/X11R6/bin
/usr/local/bin
/home/chris/bin
/usr/games
/home/chris/scripts  
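
A variation on the same idea, offered as a sketch rather than a replacement: writing the function body in parentheses runs it in a subshell, so IFS needs no saving and restoring. The name pathlist and its optional argument (any colon-separated list, defaulting to $PATH) are assumptions made for illustration.

```shell
pathlist() ## USAGE: pathlist [colon:separated:list]
(
    IFS=:
    set -f    ## no filename expansion on the unquoted list
    printf "%s\n" ${1:-$PATH}
)
```

Because the changes to IFS and to the noglob option happen in the subshell, the calling shell never sees them.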

21 Sep 2004:  flocate — locate a file

Most recent systems have a locate command (which may be a link to slocate). It uses a database to look up file names, but the search is done on the entire path to the file. In other words, locate qwe finds a lot of files in the /usr/lib/kbd/keymaps/ directory as well as the /home/chris/qwe file.

To look up a file name with locate, you need to use a wild card:

locate "*/qwe"
   

I have this in a command called flocate, which will look up multiple files given on the command line:

for p
do
  locate "*/$p"
done
   

31 Aug 2004:  Toggling a variable

To toggle a variable between two values, I use this var_toggle function:

var_toggle()
{
    eval "_VAR_TOGGLE=\$$1"
    [ "${_VAR_TOGGLE:-0}" = "${3:-0}" ] &&
                 _VAR_TOGGLE=${2:-1} ||
                 _VAR_TOGGLE=${3:-0}
    eval "$1=\$_VAR_TOGGLE"
}
   

The first argument is the name of the variable to be toggled. Successive calls to var_toggle alternate the value of the variable between two values.

If no other arguments are given, the variable is toggled between 1 and 0.

$ var=1
$ var_toggle var; echo $var
0
$ var_toggle var; echo $var
1
$ var_toggle var; echo $var
0
$ var_toggle var; echo $var
1

If one other argument is given, the value alternates between that value and 0.

$ var_toggle var 13; echo $var
0
$ var_toggle var 13; echo $var
13
$ var_toggle var 13; echo $var
0
$ var_toggle var 13; echo $var
13
   

If two more arguments are given, the value is toggled between those two.

$ var_toggle var 13 5; echo $var
5
$ var_toggle var 13 5; echo $var
13
$ var_toggle var 13 5; echo $var
5
$ var_toggle var 13 5; echo $var
13
   

24 Aug 2004:  Redirecting stdout and stderr

By default, each Unix process has 3 file descriptors (FDs) assigned to it, 0, 1 and 2; these are known as stdin, stdout, and stderr respectively. They are normally connected to your terminal: stdin is the keyboard; stdout and stderr are your screen.

Each of these represents a stream that can be redirected to other places, such as files or pipes. The output streams, stdout and stderr can be combined and sent to the same place, or directed to different locations.

If you redirect stdout (FD1) to a file, stderr (FD2) will still go to your screen.

$ ls -ld /home /qwerty
ls: /qwerty: No such file or directory
drwxr-xr-x  11 root root 4096 Jul 11 02:58 /home
   

If we redirect stdout to the bit bucket, the stderr will still be sent to the screen:

$ ls -ld /home /qwerty 1>/dev/null
ls: /qwerty: No such file or directory
   

(NOTE: the 1 can be left off; >xxxx implies 1>xxxx.)

Similarly, if we redirect stderr to the bit bucket, the stdout will still be sent to the screen:

$ ls -ld /home /qwerty 2>/dev/null
drwxr-xr-x  11 root root 4096 Jul 11 02:58 /home
   

Or we can redirect both stderr and stdout to the bit bucket:

$ ls -ld /home /qwerty >/dev/null 2>/dev/null
   

This works with /dev/null, which isn't a real file. Redirecting stdout and stderr individually to the same regular file will not work, because each redirection opens and truncates the file separately, and the two streams then overwrite each other's output:

$ ls -ld /home /qwerty >/tmp/xxx 2>/tmp/xxx
   

(The full explanation is too technical for this tip, but it is akin to redirecting output to the same file as the input.)

To redirect both stdout and stderr to the same file, we redirect one stream to the file, then redirect the other to the first stream:

$ ls -ld /home /qwerty >/tmp/xxx 2>&1
   

The order of the redirections is important. This will send stderr to the terminal and stdout to the file /tmp/xxx:

$ ls -ld /home /qwerty 2>&1 >/tmp/xxx
   

Sending stderr to stdout attaches the stream to wherever stdout is pointing at the time of the redirection. It is as if stdout and stderr are variables; you are doing the equivalent of:

## the defaults
stdout=screen
stderr=screen

## redirect stderr
stderr=$stdout

## redirect stdout
stdout=/dev/null
   

Now, stdout=/dev/null, and stderr=screen.

If you change the order, the result is different:

## the defaults
stdout=screen
stderr=screen

## redirect stdout
stdout=/dev/null

## redirect stderr
stderr=$stdout
   

Now, stdout and stderr are both pointing to /dev/null.
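
The effect of the ordering can be checked with a small experiment. This is only a sketch: the helper function both and the temporary file name are made up for the demonstration.

```shell
both() ## hypothetical helper: one line to stdout, one to stderr
{
    echo "to stdout"
    echo "to stderr" >&2
}

tmp=${TMPDIR:-/tmp}/redir_demo.$$

## file first, then 2>&1: both lines end up in the file
both > "$tmp" 2>&1
merged=$(cat "$tmp")

## 2>&1 first: stderr joins the old stdout (here, the pipe feeding
## the command substitution); only then is stdout moved to the file
leaked=$(both 2>&1 > "$tmp")
filed=$(cat "$tmp")

rm -f "$tmp"
```

After this runs, $merged holds both lines, $leaked holds only the stderr line, and $filed holds only the stdout line.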

17 Aug 2004:  A list of directories

To see a list of subdirectories of the current directory:

printf "%s\n" */
   

With a Bourne shell:

echo */.
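
When there are no subdirectories, the pattern */ is left unexpanded and is printed literally. A minimal sketch that guards against that and also drops the trailing slash; the function name subdirs is hypothetical:

```shell
subdirs() ## USAGE: subdirs -- one subdirectory of the current directory per line
{
    for sd_dir in */
    do
        ## an unmatched pattern stays as the literal string "*/"
        [ -d "$sd_dir" ] && printf "%s\n" "${sd_dir%/}"
    done
    return 0
}
```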
   

10 Aug 2004:  Learning to read

Taking the Heads Up tip from 2 weeks ago a little further, let's look at some more things the read command can do. The basic read command does more than just read a line of input. It strips leading and trailing whitespace, and it processes escape sequences introduced by a backslash.

The primary function of this is to allow lines to be continued by ending a line with a backslash. If a file contains:

This is the first line \
this is a continuation \
and so is this
   

Using read, all three lines will be concatenated into a single line with one command:

$ read x < $HOME/txt
$ echo "$x"
This is the first line this is a continuation and so is this
   

To prevent backslashes being interpreted, raw mode (-r) is used:

$ read -r x < $HOME/txt
$ echo "$x"
This is the first line \
   

If more than one variable is given as an argument to read, the line will be broken up (using the characters in IFS as separators) and assigned to each variable in turn. If there are more words than variables, the last variable will contain the remainder of the line:

$ read -r a b c d  < $HOME/txt
$ printf "%s\n" "a=$a" "b=$b" "c=$c" "d=$d"
a=This
b=is
c=the
d=first line \
   

Or, without using raw mode:

$ read  a b c d  < $HOME/txt
$  printf "%s\n" "a=$a" "b=$b" "c=$c" "d=$d"
a=This
b=is
c=the
d=first line this is a continuation and so is this
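
Since the splitting is governed by IFS, setting IFS for the duration of a single read command splits a line on any delimiter. A sketch, using a made-up record in the layout of /etc/passwd:

```shell
## hypothetical record, for illustration only
line='chris:x:1000:1000:Chris:/home/chris:/bin/sh'

## IFS=: applies to this one read command only; the here-document
## supplies the line on standard input
IFS=: read -r user pass uid gid rest <<EOF
$line
EOF
```

The first four fields land in their own variables, and $rest receives the remainder of the line, colons included.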
   

3 Aug 2004:  New Bash Parameter expansion

The recently released Bash 3.0 has added a form of brace expansion that can replace, and do more than, the external command seq found on GNU/Linux systems.

In version 3 of Bash, brace expansion can be used to expand a range of numbers — or letters.

$ echo {1..13}
1 2 3 4 5 6 7 8 9 10 11 12 13
$ echo {h..o}
h i j k l m n o
   

If the first number or letter is higher than the second, the range is expanded in descending order:

$ echo {z..o}
z y x w v u t s r q p o
$ echo {9..3}
9 8 7 6 5 4 3
   

27 Jul 2004:  Heads up

A common way of getting the first line from a file:

var=`head -1 FILE`
   

But why use an external command?

If you want the first line of the file, why not just read it?

read var < FILE
   

If you want 2 lines:

{
  read var1
  read var2
} < FILE
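
The same idea extends to any number of lines with a loop. This is a sketch, not a drop-in replacement for head: the function name first_lines is hypothetical, and IFS= with read -r is used so that each line is printed exactly as it appears in the file.

```shell
first_lines() ## USAGE: first_lines N FILE
{
    fl_n=$1
    while [ "$fl_n" -gt 0 ] && IFS= read -r fl_line
    do
        printf "%s\n" "$fl_line"
        fl_n=$(( $fl_n - 1 ))
    done < "$2"
}
```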
   

20 Jul 2004:  Random thoughts

Some modern shells have a $RANDOM variable that generates a different random integer between 0 and 32767 each time it is referenced.

This can be used to select a random string from a list. The randstr function selects one of its arguments at random and puts it in the variable $_RETVAL:

randstr() {
    [ $# -eq 0 ] && return 1
    n=$(( ($RANDOM % $#) + 1 ))
    eval _RETVAL=\${$n}
}
   

For example, to pick a card at random:

randstr diamonds hearts clubs spades
suit=$_RETVAL
randstr Ace 2 3 4 5 6 7 8 9 10 Jack Queen King
card="$_RETVAL of $suit"
echo $card
   

In bash2 or ksh93, you can use an array, populated through brace expansion:

deck=( {A,2,3,4,5,6,7,8,9,10,J,Q,K}_of_{Diamonds,Hearts,Clubs,Spades} )
randstr "${deck[@]}"
echo $_RETVAL
   

You can roll dice:

randstr 1 2 3 4 5 6
echo $_RETVAL
   

...with as many sides as you like; e.g. a 12-sided die:

randstr 1 2 3 4 5 6 7 8 9 10 11 12
echo $_RETVAL
  

Of course, the dice could be more efficiently implemented with:

roll() ## USAGE: roll [N] -- N = number of sides on die; default 6
{
    roll_sides=${1:-6}
    _RETVAL=$(( $RANDOM % $roll_sides + 1 ))
}
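
In shells without $RANDOM, one possible fallback is awk's rand() function. This is only a sketch: the name random_below is hypothetical, and because srand() seeds from the time of day, two calls within the same second will normally return the same number.

```shell
random_below() ## USAGE: random_below N -- integer from 0 to N-1 in $_RETVAL
{
    _RETVAL=$(awk -v n="$1" 'BEGIN { srand(); print int(rand() * n) }')
}

random_below 6
die=$(( $_RETVAL + 1 ))    ## 1 to 6, like roll above
```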
  

13 Jul 2004:  Useful variables

I have some variables that I include in almost all my shell scripts.

I keep them in /usr/local/bin/standard-vars, and just source that file at the top of each script:

. standard-vars
   

I put problematic characters, such as newline (NL) and escape (ESC) into variables:

## bash and ksh93 specific;
## in other shells, replace the $'\X' with a literal character
##            DEC   OCT   HEX
NL=$'\n'   ##  10, \012, 0x0a, a literal newline
CR=$'\r'   ##  13, \015, 0x0d, carriage return
TAB=$'\t'  ##   9, \011, 0x09, tab
ESC=$'\e'  ##  27, \033, 0x1b, escape
   

For some scripts, I would insert the above into the script itself.

The code used for those values is specific to bash and ksh93. In other shells one has to replace the escape sequence with the literal character:

NL='
'
CR='
'
   

Your web browser is probably unable to show that the character assigned to CR is a carriage return; that's one reason I like to use the $'\X' syntax in scripts posted on websites or to Usenet newsgroups.
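
Another portable option, sketched here as an assumption rather than taken from the standard-vars script, is to let printf produce the characters. Command substitution strips trailing newlines, so NL needs a sentinel character that is removed afterwards.

```shell
TAB=$(printf '\t')
CR=$(printf '\r')
ESC=$(printf '\033')
NL=$(printf '\nX')   ## X protects the newline from being stripped...
NL=${NL%X}           ## ...and is removed here
```

This costs a few command substitutions, but nothing in the script depends on invisible characters surviving a copy and paste.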

There are more variables in my standard-vars script, mostly dealing with manipulating the terminal display. I'll look at them in the near future.

The current incarnation of the standard-vars script is at http://cfaj.freeshell.org/src/scripts/standard-vars-sh.

6 July 2004:  Centering text on a line

This function will centre text on a line of a given length:

centre() ## USAGE: centre width text...
{
   c_width=$1
   shift
   c_text="$*"
   c_width=$(( ($c_width + ${#c_text}) / 2 )) 
   printf "%${c_width}.${c_width}s\n" "$c_text"
}
   

Sample usage:

centre 45 this is centered on 45 characters
centre $COLUMNS this is centred across the entire window
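
To see why the arithmetic works, here is the same calculation written out for a 20-column line, with made-up values:

```shell
width=20
text="hello"
## field = (20 + 5) / 2 = 12: right-aligning 5 characters in a
## 12-column field leaves (20 - 5) / 2 = 7 spaces on the left
field=$(( ($width + ${#text}) / 2 ))
line=$(printf "%${field}.${field}s" "$text")
```

The text is right-aligned in a field that ends halfway between the text's own length and the full width, which puts equal space (give or take one column) on either side.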
   

29 June 2004:  Setting multiple variables with one command

By using eval, it is possible to set more than one variable with a single command.

This is, perhaps, best illustrated by the date command, which is often used to set multiple variables, for example, YEAR, MONTH and DAY.

Too often, I see it used this way:

YEAR=`date +%Y`
MONTH=`date +%m`
DAY=`date +%d`
   

Not only can it be done with a single call to date, but multiple calls can give the wrong results, if the date crosses a boundary between the calls. This is most likely to happen when using minutes and seconds.

Doing it the correct way ensures that all the variables are set using the same date and time:

 eval "`date "+DATE=%Y-%m-%d
               YEAR=%Y
               MONTH=%m
               DAY=%d
               TIME=%H:%M:%S
               HOUR=%H
               MINUTE=%M
               SECOND=%S
               datestamp=%Y-%m-%d_%H.%M.%S
               DayOfWeek=%a
               MonthAbbrev=%b"`"
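
A sketch of an alternative that avoids eval, assuming a here-document is acceptable: read splits a single line of date output into several variables. The choice of fields is only for illustration.

```shell
read YEAR MONTH DAY HOUR MINUTE SECOND <<EOF
$(date "+%Y %m %d %H %M %S")
EOF
DATE=$YEAR-$MONTH-$DAY
TIME=$HOUR:$MINUTE:$SECOND
```

There is still only one call to date, so all the values come from the same instant.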
   

22 June 2004:  Adding numbers from a file

If you are using bash or ksh93, the quickest way to add numbers in a file is with parameter substitution and the shell's built-in arithmetic. The file must contain nothing but numbers. In bash the numbers must all be integers; in ksh93, they can include decimal fractions:

set -- `< $1`    ## for multiple files use: set -- `cat "$@"`
q=$*
printf "%s\n" $(( ${q// / + } ))
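
In a POSIX shell without the ${var//pattern/string} substitution, a loop gives the same total for integers. A sketch; the function name sum_file is hypothetical, and the file must still contain nothing but whitespace-separated integers:

```shell
sum_file() ## USAGE: sum_file FILE -- integer total in $_RETVAL
{
    _RETVAL=0
    for sf_n in $(cat "$1")
    do
        _RETVAL=$(( $_RETVAL + $sf_n ))
    done
}
```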
   

15 June 2004:  How many days in the month?

All it takes to determine the number of days in any month is a simple look-up table, with a reference to last week's tip if the month is February.

days_in_month() { ## USAGE: days_in_month [month [year]]
    if [ -n "$1" ]
    then
      dim_m=$1
      dim_y=$2
    else
      eval `date "+dim_m=%m dim_y=%Y"`
    fi
    case $dim_m in
        *9|*4|*6|11)
             _DAYS_IN_MONTH=30
             ;; ## 30 days hath September...
        1|01|3|03|*5|*7|*8|10|12)
             _DAYS_IN_MONTH=31 ;;
        2|02)
             is_leap_year ${dim_y:-`date +%Y`} &&
                    _DAYS_IN_MONTH=29 ||
                      _DAYS_IN_MONTH=28 ;;
    esac
}
   

The result is stored in $_DAYS_IN_MONTH.

8 June 2004:  Is this a leap year?

Is this (or any given year) a leap year?

These two functions, is_leap_yr() and is_leap_year(), both do the job. Both use the same syntax, using the current year if one is not supplied on the command line.

The first one uses arithmetic to determine whether the year is a leap year, and will only work in a POSIX shell (such as ksh or bash). The second uses pattern matching, and will work in any Bourne-type shell.

is_leap_yr() { ## USAGE: is_leap_yr [year]
    ily_year=${1:-`date +%Y`}
    [ $(( $ily_year % 400)) -eq 0 -o \
        \( $(( $ily_year % 4)) -eq 0 -a \
        $(( $ily_year % 100)) -ne 0 \) ] && {
        _IS_LEAP_YEAR=1
        return 0
    } || {
        _IS_LEAP_YEAR=0
        return 1
    }
}
   
is_leap_year() { ## USAGE: is_leap_year [year]
    ily_year=${1:-`date +%Y`}
    case $ily_year in
        *0[48] |\
        *[2468][048] |\
        *[13579][26] |\
        *[13579][26]0|\
        *[2468][048]00 |\
        *[13579][26]00 ) _IS_LEAP_YEAR=1
                         return 0 ;;
        *) _IS_LEAP_YEAR=0
           return 1 ;;
    esac
}
   

By examining either the exit status of the function or the value of $_IS_LEAP_YEAR, one can determine whether any year (in the Gregorian calendar) is a leap year:

       year=1999
       if is_leap_year $year
       then
          echo $year is a leap year
       else
          echo $year is not a leap year
       fi
   

Or test the variable $_IS_LEAP_YEAR:

       year=1999
       is_leap_year $year
       if [ $_IS_LEAP_YEAR -eq 1 ]
       then
          echo $year is a leap year
       else
          echo $year is not a leap year
       fi
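
The -o and -a operators of test are marked obsolescent in the POSIX standard. Here is a sketch of the same rule using separate tests joined by the shell's own && and ||. The name is_leap is hypothetical; unlike the functions above, it requires a year argument and sets no variable.

```shell
is_leap() ## USAGE: is_leap YEAR
{
    ## divisible by 4, and either not by 100 or else by 400
    [ $(( $1 % 4 )) -eq 0 ] && {
        [ $(( $1 % 100 )) -ne 0 ] || [ $(( $1 % 400 )) -eq 0 ]
    }
}
```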
   

1 June 2004:  Searching man pages with sman()

To search a man page for a specific term, I use this function:

sman() { ## usage: sman command search_term
  PAGER=less
  export PAGER
  LESS="$LESS${2:+ +/$2}" man $1
}
   

Examples:

sman bash EXPANSION
sman find printf
sman grep "REGULAR EXPRESSIONS"
   

Subsequent occurrences of the search term can be found by pressing "n", previous ones with "N".

25 May 2004:  Removing non-consecutive duplicate lines from a file

The uniq command will "Discard all but one of successive identical lines" from a file or input stream.

In order to remove non-consecutive duplicate lines, use awk:

awk '!x[$0]++' FILE
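
How the one-liner works: x[$0] counts how many times the current line has been seen. The post-increment returns the old count, so !x[$0]++ is true only for the first occurrence of a line, and awk's default action prints it. A small demonstration with made-up data and a more descriptive array name:

```shell
tmp=${TMPDIR:-/tmp}/dedup_demo.$$
printf "%s\n" apple pear apple fig pear > "$tmp"

## "seen" plays the role of x in the one-liner above
unique=$(awk '!seen[$0]++' "$tmp")
rm -f "$tmp"
```

The first occurrence of each line is kept, in its original order: apple, pear, fig.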
    

18 May 2004:  Printing to entire width of screen

Sometimes one wants to truncate a line to the width of the screen or window. While it's possible to do it by actually shortening the string in a variable, the easiest way to do it is with printf.

Bash2 (and later), Korn Shell 93, and the BSD shell (ash or dash on GNU/Linux) have printf built in, and it is generally installed as a command on modern *nix systems (it's required by the POSIX standard).

long_string="Bash2 and KornShell93 have printf built in, and it is generally installed as a command on modern *nix systems (it's required by the POSIX standard)."
printf "%${COLUMNS}.${COLUMNS}s" "$long_string"
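
A small demonstration with a fixed width, so that the result does not depend on the terminal. The precision (the number after the dot) truncates; the field width pads. The - flag, which the example above does not use, pads on the right instead of the left, and the trailing | is only there to make the padding visible.

```shell
width=10
long=$(printf "%-${width}.${width}s|" "a string longer than ten characters")
short=$(printf "%-${width}.${width}s|" "short")
```

Here $long is cut to its first ten characters and $short is padded out to ten, so both lines end in the same column.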

Modern shells automatically set the COLUMNS and LINES variables, but not all of them update the variables when the window's size changes. Bash has an option to do so:

shopt -s checkwinsize

In other shells, tput or stty can provide the information:

set -- `stty size`
LINES=$1
COLUMNS=$2
   

Or:

LINES=`tput lines`
COLUMNS=`tput cols`
  

4 May 2004:  Extracting multiple values from a string

Far too often, I see scripts like this:

string="123,456,789"
v1=`echo $string | cut -d, -f1`
v2=`echo $string | cut -d, -f2`
v3=`echo $string | cut -d, -f3`

It uses three calls to an external command (cut) where none is necessary.

External commands are rarely needed to parse a string in a POSIX shell, and even using a Bourne shell they can often be avoided.

The shell splits strings into words using the value of the Internal Field Separator (IFS) variable as the delimiter, so the same thing can be accomplished this way:

string="123,456,789"
oldIFS=$IFS
IFS=,
set -- $string
v1=$1
v2=$2
v3=$3
IFS=$oldIFS
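
Another approach, sketched here on the assumption that exactly three comma-separated fields are present, uses parameter expansion and leaves IFS alone. The variable rest is a hypothetical helper.

```shell
string="123,456,789"
v1=${string%%,*}      ## everything before the first comma
rest=${string#*,}     ## everything after the first comma
v2=${rest%%,*}
v3=${rest#*,}
```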

27 April 2004:  Automatic array indexing

The Korn Shell (ksh) and the Bourne-Again Shell (bash) from version 2 on support one-dimensional arrays which can be assigned by array[N]=something_or_other and referenced by ${array[N]}, where N is a number from 0 up (ksh has an upper limit of 4095 [recent versions of Korn Shell 93 have increased this], bash is limited only by available memory).

When adding consecutive elements to an array, there is no need to maintain an index into the array; instead of:

  n=0
  for c in red green blue white
  do
    array[$n]=$c
    n=$(( $n + 1 ))
  done
   

just use:

  for c in red green blue white
  do
    array[${#array[@]}]=$c
  done
    

Since array elements start at index 0, the number of elements in the array, ${#array[@]}, is also the number of the next empty element.

More elements can be added at any time using the same syntax.

Of course, it doesn't work on a sparse array, that is, one with unset elements.

tt.shtml: last modified 29-May-2011