Posts Tagged ‘bash’

globstar in bash 4 follows directory symlinks

Globstar is a new feature in bash 4 that lets us traverse a directory tree more easily.

Unfortunately, it follows directory symlinks and thus can easily cause problems.

(bleeding) desktop t # echo ${BASH_VERSINFO[@]}
4 0 17 2 release x86_64-pc-linux-gnu
(bleeding) desktop t # shopt -s globstar
(bleeding) desktop t # ls -l
total 0
lrwxrwxrwx 1 root root 1 2009-04-16 18:58 t -> .
(bleeding) desktop t # find
.
./t
(bleeding) desktop t # echo **
t t/t t/t/t t/t/t/t t/t/t/t/t t/t/t/t/t/t t/t/t/t/t/t/t t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t
(bleeding) desktop t #

Oh no…

If you were unlucky enough to try something like echo /proc/**/meminfo, bash would probably make you wait for minutes before dying with “Insufficient memory.” (Each process directory under /proc contains a root symlink, which points to that process's root directory.)

If you use GRUB to boot your Linux system, you are likely to find a symlink in /boot also named boot pointing to the directory itself. Yes, this is going to confuse bash, too. And there surely are many more cases.

So let’s continue writing find ... | xargs ...
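As a rough illustration of the find-based alternative, the sketch below recreates the self-referencing symlink in a scratch directory. It assumes a find and xargs that support -mindepth, -print0 and -0 (GNU and BSD versions do); the directory and file names are made up for the demo.

```shell
#!/bin/bash
# Recreate the self-referencing symlink from the session above in a scratch directory.
demo_dir=$(mktemp -d)
ln -s . "$demo_dir/t"
touch "$demo_dir/file.txt"

# find does not follow symlinks unless asked to (-L), so "t" is listed once
# and never descended into. -print0 / -0 keep unusual file names intact.
found=$(cd "$demo_dir" && find . -mindepth 1 -print0 | xargs -0 -n 1 echo | sort)
echo "$found"

rm -rf "$demo_dir"
```

Unlike the ** expansion, this lists the symlink as a single entry and terminates immediately.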


Extended pattern matching in BASH

Many features provided by BASH are not widely known or used, but they really can be useful. One example is extglob (extended pattern matching) – with this, a pattern can be almost as powerful as a regular expression.

Use “shopt -s extglob” to enable this feature. After that, in addition to the standard asterisks, question marks and square brackets, we can also use the following five sub-patterns:

?(pattern-list): Matches empty or one of the patterns
*(pattern-list): Matches empty or any number of occurrences of the patterns
+(pattern-list): Matches one or more occurrences of the patterns
@(pattern-list): Matches exactly one of the patterns
!(pattern-list): Matches anything EXCEPT any of the patterns

The pattern-list represents one or more patterns, which can again contain these extended sub-patterns, delimited by pipe signs (|). Two simple examples:

  • rm -rf !(lost+found)
    Removes everything except lost+found
  • for x in *.@(jp?(e)g|gif|png)
    Loops through all files having extension jpg, jpeg, gif, or png
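The sub-patterns can also be tried out without touching any files, because case uses the same pattern syntax. Here is a minimal sketch; the classify function and the sample names are hypothetical and exist only for this demonstration.

```shell
#!/bin/bash
shopt -s extglob   # must be enabled before the patterns below are parsed

# classify NAME: report which extended pattern a (hypothetical) file name matches.
classify() {
    case $1 in
        *.@(jp?(e)g|gif|png)) echo image ;;         # jpg, jpeg, gif or png extension
        +([0-9]))             echo number ;;        # one or more digits, nothing else
        !(*.*))               echo no-extension ;;  # anything that contains no dot
        *)                    echo other ;;
    esac
}

classify photo.jpeg
classify 2009
classify README
classify notes.txt
```

Note that shopt -s extglob sits on its own line before the function definition: bash needs the option switched on when it parses the patterns, not just when it runs them.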

The following example is a little more complicated. It prints a list of default/GNU/Intel C/C++ compilers present in directories specified by $PATH:

#!/bin/bash
shopt -s extglob nullglob
x="${PATH//:/,}"
eval "printf '%s\n' {$x}/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))"

(NOTE: nullglob makes a pattern that matches no file expand to nothing instead of being left unchanged.)

Doesn’t it look like a regular expression? On my system the output looks like this:

/usr/bin/c++
/usr/bin/c++-4.2.4
/usr/bin/c++-4.3.3
/usr/bin/cc
/usr/bin/g++
/usr/bin/g++-4.2.4
/usr/bin/g++-4.3.3
/usr/bin/gcc
/usr/bin/gcc-4.2.4
/usr/bin/gcc-4.3.3
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/c++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/g++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/gcc
/opt/intel/cce/10.1.018/bin/icc
/opt/intel/cce/10.1.018/bin/icpc

Unfortunately, we cannot use the following code:

x="${PATH//:/|}"
printf '%s\n' @($x)/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))

This is not surprising, however: no sub-pattern is allowed to match a string containing a forward slash (the path delimiter)[1]. (For the same reason a single asterisk won’t expand to a file in a subdirectory, which is usually what we want. Bash 4 has introduced **, which matches across slashes as well.)

[1] In a case statement or a [[ ]] builtin (using the == operator), sub-patterns indeed match slashes.
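One way to avoid both eval and the slash restriction is to split $PATH into an array and glob one directory at a time. This is only a sketch of that idea, not the script used above; list_compilers is a hypothetical helper name, and PATH entries containing newlines are not handled.

```shell
#!/bin/bash
shopt -s extglob nullglob

# list_compilers PATHLIST: print compiler-like executables found in a
# colon-separated directory list (hypothetical helper for illustration).
list_compilers() {
    local dirs dir matches
    IFS=: read -r -a dirs <<< "$1"
    for dir in "${dirs[@]}"; do
        [[ -n $dir ]] || dir=.   # an empty PATH entry means the current directory
        # The directory is a quoted literal prefix; only the file name is a pattern.
        matches=( "$dir"/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9]))) )
        if (( ${#matches[@]} > 0 )); then
            printf '%s\n' "${matches[@]}"
        fi
    done
}

list_compilers "$PATH"
```

Because each directory name is quoted, it is never interpreted as a pattern, and the sub-patterns only ever have to match a single file name.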


BASH’s compat31 option

Several of my bash scripts failed when I migrated from Debian to Gentoo almost a year ago, because the two systems interpreted commands like this differently: [[ "$x" =~ '^[0-9]$' ]]. This command succeeded in Debian when $x was a single digit, but failed in Gentoo. I had to remove the single quotes surrounding the regular expression to make it work in Gentoo.

Today I finally found the reason: I was using bash 3.1 in Debian and 3.2 in Gentoo. Bash 3.2 by default mandates that regular expressions not be surrounded by quotes; however, the behavior can be modified using shopt -s compat31.
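A common way to sidestep the difference without compat31 is to keep the regular expression in a variable and expand it unquoted, which should behave the same under both 3.1 and 3.2. A minimal sketch follows; the variable and function names are made up for the example.

```shell
#!/bin/bash
# Keep the regular expression in a variable and expand it unquoted.
single_digit='^[0-9]$'

is_single_digit() {
    [[ $1 =~ $single_digit ]]
}

is_single_digit 7  && echo "7: match"
is_single_digit 42 || echo "42: no match"

# In bash 3.2 and later a quoted right-hand side is compared as a literal
# string, so this succeeds only because the two strings are identical.
[[ '^[0-9]$' =~ '^[0-9]$' ]] && echo "quoted pattern matched literally"
```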


How BASH Changes Terminal Window Title

In many distributions bash automatically changes the terminal title. I thought this was hardcoded in bash, but it turns out not to be. It is usually implemented with PROMPT_COMMAND.

The variable $PROMPT_COMMAND defines a command that is executed automatically before the primary prompt (i.e. $PS1) is displayed. It is set by a script that is sourced by interactive instances of bash. In Gentoo this is /etc/bash/bashrc:

case ${TERM} in
    xterm*|rxvt*|Eterm|aterm|kterm|gnome*|interix)
        PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME%%.*}:${PWD/$HOME/~}\007"'
        ;;
    screen)
        PROMPT_COMMAND='echo -ne "\033_${USER}@${HOSTNAME%%.*}:${PWD/$HOME/~}\033\\"'
        ;;
esac

In fact, the first version above also works with newer versions of screen.
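The same idea can be written with small functions and printf instead of echo -ne, which may be easier to adapt in a personal ~/.bashrc. This is a sketch, not what Gentoo ships; set_title and shorten_home are hypothetical helper names, and only the xterm-style escape sequence from the first branch above is used.

```shell
#!/bin/bash
# set_title TEXT: emit the xterm "set window title" sequence (ESC ] 0 ; TEXT BEL).
set_title() {
    printf '\033]0;%s\007' "$1"
}

# shorten_home DIR: print DIR with a leading $HOME replaced by ~.
shorten_home() {
    case $1 in
        "$HOME")   printf '%s' '~' ;;
        "$HOME"/*) printf '%s' "~${1#"$HOME"}" ;;
        *)         printf '%s' "$1" ;;
    esac
}

# Only install the hook for terminals known to understand the sequence.
case $TERM in
    xterm*|rxvt*)
        PROMPT_COMMAND='set_title "${USER}@${HOSTNAME%%.*}:$(shorten_home "$PWD")"'
        ;;
esac
```

Doing the ~ substitution in a case statement avoids relying on how a given bash version treats an unquoted ~ in the replacement part of ${PWD/$HOME/~}.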


BASH’s ‘read’ built-in supports ‘\0’ as delimiter

I thought it was impossible to use ‘\0’ as a delimiter in bash, but noticed yesterday that Gentoo’s ebuild.sh had pipelines like this:

find ….. -print0 |
while read -r -d $'\0' x; do
    # Do something with file $x
done

This makes it possible to handle any strange filename correctly, even one that contains newline ('\n') or carriage return ('\r') characters. (Some other commands, including sort and xargs, have options to make the null character the delimiter for the same reason.)

Because BASH internally uses C-style strings, in which '\0' is the terminator, read -d $'\0' is essentially equivalent to read -d ''. This is why I believed read did not accept null-delimited strings. However, it turns out that BASH actually handles this correctly.

I checked BASH’s source code and found that the delimiter is simply determined by delim = *list_optarg; (bash-3.2/builtins/read.def, line 296), where list_optarg points to the argument following -d. Therefore, it makes no difference to the value of delim whether $'\0' or '' is used.
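A small self-contained sketch of the technique, using made-up file names in a scratch directory. It differs from the ebuild.sh pipeline in two ways that are my own choices: IFS= is added so that leading and trailing whitespace in names is preserved, and process substitution replaces the pipe so that the loop runs in the current shell and the array is still populated afterwards.

```shell
#!/bin/bash
# Create a few awkward file names in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain" "$demo_dir/with space" "$demo_dir/"$'with\nnewline'

names=()
# IFS= keeps surrounding whitespace, -r keeps backslashes,
# and -d '' makes read stop at the next NUL byte.
while IFS= read -r -d '' path; do
    names+=( "${path##*/}" )
done < <(find "$demo_dir" -type f -print0)

echo "found ${#names[@]} files"
rm -rf "$demo_dir"
```

With a newline-delimited loop the third file would have been split into two bogus entries; here each name arrives as one element.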
