Posts Tagged ‘shell’

globstar in bash 4 follows directory symlinks

Globstar is a new feature in bash 4 that lets us traverse a directory tree more easily.

Unfortunately, it follows directory symlinks and thus can easily cause problems.

(bleeding) desktop t # echo ${BASH_VERSINFO[@]}
4 0 17 2 release x86_64-pc-linux-gnu
(bleeding) desktop t # shopt -s globstar
(bleeding) desktop t # ls -l
total 0
lrwxrwxrwx 1 root root 1 2009-04-16 18:58 t -> .
(bleeding) desktop t # find
.
./t
(bleeding) desktop t # echo **
t t/t t/t/t t/t/t/t t/t/t/t/t t/t/t/t/t/t t/t/t/t/t/t/t t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t/t
(bleeding) desktop t #

Oh no…

If you were unlucky enough to try something like echo /proc/**/meminfo, it would probably make you wait for minutes before dying with “Insufficient memory.” (Each /proc/<pid> directory contains a root symlink that points back to /.)

If you use GRUB to boot your Linux system, you are likely to find a symlink in /boot also named boot pointing to the directory itself. Yes, this is going to confuse bash, too. And there surely are many more cases.

So let’s continue writing find ... | xargs ...
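Here is a minimal sketch of that fallback, assuming a scratch directory that recreates the t -> . loop from above. The names demo_dir and file.txt are made up for the demonstration. find does not descend into symlinked directories unless asked to with -L, so the loop is harmless:

```shell
#!/bin/bash
# Recreate the self-referential symlink in a throwaway directory.
demo_dir=$(mktemp -d)
ln -s . "$demo_dir/t"
touch "$demo_dir/file.txt"

# Without -L, find reports the symlink t but does not recurse through it,
# so each real file is visited exactly once.
found=$(cd "$demo_dir" && find . -name '*.txt' -print0 | xargs -0 -n 1 basename)
printf '%s\n' "$found"

# Clean up; rm removes the symlink itself rather than following it.
rm -rf "$demo_dir"
```

The -print0 / -0 pair keeps the pipeline safe for filenames containing spaces or newlines.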


Extended pattern matching in BASH

Many features provided by BASH are not widely known or used, but they really can be useful. One example is extglob (extended pattern matching) – with this, a pattern can be almost as powerful as a regular expression.

Use “shopt -s extglob” to enable this feature. After that, in addition to the standard asterisks, question marks and square brackets, we can also use the following five sub-patterns:

?(pattern-list): Matches zero or one occurrence of the patterns
*(pattern-list): Matches zero or more occurrences of the patterns
+(pattern-list): Matches one or more occurrences of the patterns
@(pattern-list): Matches exactly one of the patterns
!(pattern-list): Matches anything EXCEPT any of the patterns

The pattern-list represents one or more patterns, which can again contain these extended sub-patterns, delimited by pipe signs (|). Two simple examples:

  • rm -rf !(lost+found)
    Removes everything except lost+found
  • for x in *.@(jp?(e)g|gif|png)
    Loops through all files having extension jpg, jpeg, gif, or png
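The second pattern can be tried without touching any files, because [[ ... == pattern ]] matches plain strings. This is only a sketch; the sample names and the variable image_pattern are invented for the illustration:

```shell
#!/bin/bash
shopt -s extglob

# Keeping the pattern in a variable means it is interpreted at run time,
# after extglob has been switched on.
image_pattern='*.@(jp?(e)g|gif|png)'

matched=()
for name in a.jpg b.jpeg c.gif d.png e.jpgg f.txt; do
    # An unquoted right-hand side of == is treated as a pattern.
    if [[ $name == $image_pattern ]]; then
        matched+=("$name")
    fi
done

printf '%s\n' "${matched[@]}"
```

Only the first four names are printed: e.jpgg fails because @(...) has to match the whole extension, and f.txt is not in the list at all.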

The following example is a little more complicated. It prints a list of default/GNU/Intel C/C++ compilers present in directories specified by $PATH:

#!/bin/bash
shopt -s extglob nullglob
x="${PATH//:/,}"
eval "printf '%s\n' {$x}/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))"

(NOTE: with nullglob set, a pattern that matches no file expands to nothing instead of being left unchanged.)

Doesn’t it look like a regular expression? On my system the output looks like this:

/usr/bin/c++
/usr/bin/c++-4.2.4
/usr/bin/c++-4.3.3
/usr/bin/cc
/usr/bin/g++
/usr/bin/g++-4.2.4
/usr/bin/g++-4.3.3
/usr/bin/gcc
/usr/bin/gcc-4.2.4
/usr/bin/gcc-4.3.3
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/c++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/g++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/gcc
/opt/intel/cce/10.1.018/bin/icc
/opt/intel/cce/10.1.018/bin/icpc

Unfortunately, the following code does not work:

x="${PATH//:/|}"
printf '%s\n' $x/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))

This is not surprising, however. During pathname expansion, no sub-pattern is allowed to match a string that includes a forward slash (the path delimiter)[1]. (This means a single asterisk won’t expand to a file in a subdirectory, which is usually what we want. Bash 4 has introduced **, which matches slashes as well.)

[1] In a case statement or a [[ ]] builtin (using the == operator), sub-patterns indeed match slashes.
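If the eval is a concern, one possible alternative is to split $PATH into an array and glob inside each directory separately, so that the directory part never has to go through a sub-pattern. This is a sketch rather than a drop-in replacement; list_compilers is a hypothetical helper name, and the pattern is written with a plain dot, which is already literal in a glob:

```shell
#!/bin/bash
shopt -s extglob nullglob

# list_compilers takes one PATH-style, colon-separated string.
list_compilers() {
    # Same pattern as above, stored in a variable so it is expanded at run time.
    local pattern='@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(.+([0-9])))'
    local dirs dir file

    # Split the argument on colons.
    IFS=: read -r -a dirs <<< "$1"

    for dir in "${dirs[@]}"; do
        # Skip empty entries so that "/$pattern" is never tried by accident.
        [[ -n $dir ]] || continue
        # The quoted directory is literal; only the unquoted pattern is globbed.
        for file in "$dir"/$pattern; do
            printf '%s\n' "$file"
        done
    done
}

list_compilers "$PATH"
```

With nullglob set, directories that contain no match simply contribute nothing to the output.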


Using exec /bin/bash in .cshrc

I don’t like tcsh, which is the login shell here. I cannot make bash the default with ypchsh, because the administrator does not allow this change. I’m also tired of typing exec bash every time, so I experimented with adding exec /bin/bash to ~/.cshrc. This is the code I’m currently using, which seems to work well:

if ($?prompt && "$0" =~ -* ) then
  exec /bin/bash
endif

The first condition $?prompt determines whether this is an interactive shell. This is necessary, otherwise various utilities including scp can fail in strange ways. (scp needs to execute commands using the login shell remotely.)

The second one, "$0" =~ -* excludes non-login instances. So I can type tcsh to explicitly switch to tcsh. This is useful if someone more familiar with the C shell is temporarily using my account.

This is probably not the best way to determine whether the shell is a login shell, but well, it works in all cases I can imagine… (Similarly, we can use [[ "$0" == -* ]] in bash, but it seems people prefer [[ $- == *i* ]], which tests for an interactive shell rather than a login shell.)
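For comparison, here is a small bash sketch that runs the checks side by side. The three function names are invented for this example; login_shell is a read-only option that bash maintains itself and that shopt -q can query:

```shell
#!/bin/bash
# Three independent checks; the function names are made up for this sketch.
is_login_shell()       { shopt -q login_shell; }  # option set by bash itself
is_interactive()       { [[ $- == *i* ]]; }       # "i" is in $- when interactive
has_login_style_name() { [[ $0 == -* ]]; }        # argv[0] starts with "-"

for check in is_login_shell is_interactive has_login_style_name; do
    if "$check"; then
        printf '%s: yes\n' "$check"
    else
        printf '%s: no\n' "$check"
    fi
done
```

Starting bash with -l from another shell gives a login shell whose $0 has no leading dash, so the name-based test and login_shell can disagree.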


BASH’s ‘read’ built-in supports ‘\0’ as delimiter

I thought it was impossible to use ‘\0’ as a delimiter in bash, but noticed yesterday that Gentoo’s ebuild.sh had pipelines like this:

find ….. -print0 |
while read -r -d $'\0' x; do
    # Do something with file $x
done

This makes it possible to handle any strange filename correctly, even if it contains newline ('\n') or carriage return ('\r') characters. (Some other commands, including sort and xargs, have options that make the null character the delimiter for the same reason.)

Because BASH internally uses C-style strings, in which '\0' is the terminator, read -d $'\0' is essentially equivalent to read -d ''. This is why I believed read did not accept null-delimited strings. However, it turns out that BASH actually handles this correctly.

I checked BASH’s source code and found that the delimiter is simply determined by delim = *list_optarg; (bash-3.2/builtins/read.def, line 296), where list_optarg points to the argument following -d. For an empty argument, *list_optarg is the terminating '\0' itself, so it makes no difference to the value of delim whether $'\0' or '' is used.
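A quick way to convince yourself is to feed read a few null-terminated records by hand. The following is a sketch with invented sample names; IFS= is added here (it is not in the ebuild.sh excerpt) so that leading and trailing blanks survive as well:

```shell
#!/bin/bash
names=()

# printf reuses the format for every argument, so each record ends in a NUL byte.
while IFS= read -r -d '' name; do
    names+=("$name")
done < <(printf '%s\0' 'plain.txt' $'two\nlines.txt' '  padded  ')

printf '%d records\n' "${#names[@]}"
```

The name with an embedded newline arrives as a single record, which a newline-delimited loop could not manage.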


Jumbled Characters after Catting a Binary File

When this happens, simply press Ctrl-V Ctrl-O Ctrl-M. Or alternatively, type “reset” and Return (Enter).

A terminal interprets the byte 0x0e as “activate the G1 character set”, and 0x0f as “activate the G0 character set”. The characters we normally read are in the G0 set. So, if there is no 0x0f byte after the last 0x0e in a binary file, everything will be shown in the unreadable G1 set, including the next shell prompt.

How does Ctrl-V Ctrl-O Ctrl-M work?
Ctrl-V is an ‘escape character’: the next keystroke will always be interpreted as a literal character. Ctrl-O is 0x0f, and Ctrl-M is carriage return. So the shell gets the command “\x0f” and outputs the error message “bash: \x0f: command not found”. The 0x0f byte in this message switches the active character set back to the readable G0.

The G1 character set is rarely used these days. Konsole chooses not to implement it at all, so we never have this problem in Konsole.
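Since all that matters is getting a 0x0f byte to the terminal, printing it directly should have the same effect, assuming the terminal handles the two bytes as described above. A minimal sketch (restore_g0 is a made-up function name that could live in ~/.bashrc):

```shell
#!/bin/bash
# \017 is octal for 0x0f (SI, "shift in"), which selects the G0 set.
# Its counterpart \016 (0x0e, SO, "shift out") selects G1.
restore_g0() {
    printf '\017'
}

restore_g0
```

The function name has to be typed blind while the prompt is still garbled, so a short alias may be more practical.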


Department Computers

The department’s computer and network administrators are the complete opposite of me. They installed everything I dislike, and almost nothing I like…

  • I dislike Ubuntu, but most of the so-called “Unix” machines are Ubuntu. (The rest are Solaris.)
  • I like bash (as most Unix-like users do these days, I bet…), but their default shell is tcsh. Perhaps this is a convention inherited from antediluvian days.
  • They have GNOME, Xfce, fvwm, Sawfish and Fluxbox installed. The only major DE/WM missing is KDE, which is my favorite.

The first time I tried to print something, it was sent to a printer in a lab 50 yards away, despite the fact that there was one only 2 yards from me. I needed to set the PRINTER environment variable so that lpr knew I wanted to use the printer in my office.

If I were using KDE, I would have finished this in 20 seconds, by catting a 2-line file into ~/.kde/env. But it took me 20 minutes to find its counterpart in GNOME, and then another 5 minutes to find out whether I should write the script ~/.gnomerc in Bourne Shell syntax or C Shell syntax. The answer is Bourne Shell syntax. Though the default shell is tcsh, ~/.gnomerc is always interpreted by /bin/sh, which is a symlink to dash (not bash) in Ubuntu and the latest version of Debian… Humph…..
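For the record, the resulting ~/.gnomerc needs only two lines. The queue name office-printer below is a placeholder, not the real name of the printer in my office:

```shell
# ~/.gnomerc is read by /bin/sh, so only plain Bourne syntax is safe here.
# "office-printer" is a placeholder queue name.
PRINTER=office-printer
export PRINTER
```

Writing the assignment and the export as separate statements keeps the file acceptable to any Bourne-compatible /bin/sh, including dash.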
