Archive for February, 2009
Extended pattern matching in BASH
By chys on February 26th, 2009
Many features provided by BASH are not widely known or used, but they really can be useful. One example is extglob (extended pattern matching) – with this, a pattern can be almost as powerful as a regular expression.
Use “shopt -s extglob” to enable this feature. After that, in addition to the standard asterisks, question marks and square brackets, we can also use the following five sub-patterns:
?(pattern-list): Matches zero or one occurrence of the patterns
*(pattern-list): Matches zero or more occurrences of the patterns
+(pattern-list): Matches one or more occurrences of the patterns
@(pattern-list): Matches exactly one of the patterns
!(pattern-list): Matches anything EXCEPT any of the patterns
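The five operators can be tried out safely on plain strings before pointing them at real files. The following is only a minimal sketch: matches is a hypothetical helper written for this demonstration, and it relies on [[ ]] treating an unquoted right-hand side as a pattern.

```shell
#!/bin/bash
shopt -s extglob   # enable the extended sub-patterns

# matches STRING PATTERN: succeed if the whole string matches the pattern.
# (Hypothetical helper, not part of bash.)
matches() {
    [[ $1 == $2 ]]   # $2 is left unquoted on purpose so it acts as a pattern
}

matches "color"  "colo?(u)r"    && echo "?() accepts zero occurrences"
matches "colour" "colo?(u)r"    && echo "?() accepts one occurrence"
matches "ab"     "a*(x)b"       && echo "*() accepts zero occurrences"
matches "axxxb"  "a*(x)b"       && echo "*() accepts many occurrences"
matches "ab"     "a+(x)b"       || echo "+() rejects zero occurrences"
matches "a.gif"  "a.@(gif|png)" && echo "@() accepts exactly one alternative"
matches "a.txt"  "!(*.gif)"     && echo "!() accepts anything but the listed patterns"
```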
The pattern-list represents one or more patterns, which can again contain these extended sub-patterns, delimited by pipe signs (|). Two simple examples:
rm -rf !(lost+found)
Removes everything except lost+found
for x in *.@(jp?(e)g|gif|png)
Loops through all files having extension jpg, jpeg, gif, or png
The following example is a little more complicated. It prints a list of default/GNU/Intel C/C++ compilers present in directories specified by $PATH:
#!/bin/bash
shopt -s extglob nullglob
x="${PATH//:/,}"
eval "printf '%s\n' {$x}/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))"
(NOTE: nullglob makes a pattern that matches no file expand to nothing instead of being left unchanged.)
Doesn’t it look like a regular expression? The output looks like this on my system:
/usr/bin/c++
/usr/bin/c++-4.2.4
/usr/bin/c++-4.3.3
/usr/bin/cc
/usr/bin/g++
/usr/bin/g++-4.2.4
/usr/bin/g++-4.3.3
/usr/bin/gcc
/usr/bin/gcc-4.2.4
/usr/bin/gcc-4.3.3
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/c++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/g++
/usr/x86_64-pc-linux-gnu/gcc-bin/4.2.4/gcc
/opt/intel/cce/10.1.018/bin/icc
/opt/intel/cce/10.1.018/bin/icpc
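The effect of nullglob mentioned in the note can be checked in isolation. This is a small sketch, assuming a freshly created scratch directory (made with mktemp) in which nothing can match the pattern; the array names are made up for the demonstration.

```shell
#!/bin/bash
# An empty scratch directory guarantees that *.nomatch matches no file.
dir=$(mktemp -d)

shopt -u nullglob
without=( "$dir"/*.nomatch )   # pattern is kept literally: one word

shopt -s nullglob
with=( "$dir"/*.nomatch )      # pattern expands to nothing: zero words

echo "nullglob off: ${#without[@]} word(s)"
echo "nullglob on:  ${#with[@]} word(s)"

rmdir "$dir"
```

Without nullglob, a directory in $PATH that contains no compiler would contribute the unexpanded pattern itself to the output.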
Unfortunately, we cannot use the following code:
x="${PATH//:/|}"
printf '%s\n' $x/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9])))
This is not surprising, however. No sub-pattern is allowed to expand to a string that includes a forward slash (the path delimiter)[1]. (This means a single asterisk won’t expand to a file in a subdirectory, which is usually the desired behavior. Bash 4 introduced **, enabled with shopt -s globstar, which matches across slashes as well.)
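If eval feels uncomfortable, one possible alternative is to split the directory list and glob inside each directory in turn, so that no sub-pattern ever has to cross a slash. This is only a sketch: list_compilers is a hypothetical helper name, and empty $PATH entries (which stand for the current directory) are simply skipped here.

```shell
#!/bin/bash
shopt -s extglob nullglob

# list_compilers DIRLIST: print the compilers found in a colon-separated
# list of directories, using the same file name pattern as the eval version.
list_compilers() {
    local -a dirs found
    local dir
    IFS=: read -r -a dirs <<< "$1"
    for dir in "${dirs[@]}"; do
        [[ -n $dir ]] || continue   # skip empty entries
        found=( "$dir"/@(?([ig])cc|[cg]++|icpc)?(-+([0-9])+(\.+([0-9]))) )
        # With nullglob the array is empty when nothing matched.
        if (( ${#found[@]} > 0 )); then
            printf '%s\n' "${found[@]}"
        fi
    done
}

list_compilers "$PATH"
```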
[1] In a case statement or a [[ ]] conditional (using the == operator), sub-patterns do match slashes.
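The footnote can be illustrated with a string test. A minimal sketch, assuming a hypothetical helper under_prefix; the paths are just strings, so nothing needs to exist on disk.

```shell
#!/bin/bash
shopt -s extglob

# under_prefix PATH: succeed if PATH lies somewhere below /usr or /opt.
# In [[ ]] the pattern is matched against a plain string, so +(?) is free
# to swallow slashes, unlike in filename expansion.
under_prefix() {
    [[ $1 == /@(usr|opt)/+(?) ]]
}

under_prefix /usr/bin/gcc                    && echo "matched across one slash"
under_prefix /opt/intel/cce/10.1.018/bin/icc && echo "matched across several slashes"
under_prefix /home/chys/bin/gcc              || echo "other prefixes are rejected"
```

Used as a filename pattern instead, /@(usr|opt)/+(?) would only expand to the direct children of /usr and /opt.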
VIM modelines
By chys on February 22nd, 2009
I was wondering why all the Emacs modeline samples I found online worked so well, while none of the VIM ones worked at all. In fact, VIM modelines are often disabled by default (in Vi-compatible mode, for root, and in some distributions' system-wide configuration) because they are a potential security hole. (This is too bad; only the safe settings could have been enabled by default.) Add set modeline to ~/.vimrc to enable them.
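For those who prefer to do this from the shell, here is a small sketch that appends the setting only when it is not already present. enable_modeline is a hypothetical helper, and the check only recognizes a line that reads exactly set modeline.

```shell
#!/bin/bash
# enable_modeline FILE: append "set modeline" to FILE unless a line with
# exactly that text is already there. The file is created if missing.
enable_modeline() {
    local rc=$1
    touch "$rc" || return 1
    grep -qxF 'set modeline' "$rc" || printf '%s\n' 'set modeline' >> "$rc"
}

# Typical use (commented out so the sketch does not touch a real config):
# enable_modeline ~/.vimrc
```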
A modeline is a set of instructions to the editor placed in a comment, usually at the beginning of a text file. The most common use is probably to set the syntax highlighting rules, tab width and indentation.
Modeline example:
/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim:set ft=cpp ts=2 sw=2 sts=2 cindent: */
The first line is for Emacs and the second for VIM. Note that an Emacs modeline must appear on the first line of a file (or the second, if the first is a #! line), while VIM looks for modelines in the first and last few lines of a file (five by default, controlled by the modelines option).
Basically, these two lines force the file type to be C++ (especially useful for .h files, which are by default treated as C sources), set the width of a tab to 2 characters and enable automatic indentation. (My knowledge of Emacs is very limited, so this may not be accurate.)
Some resources:
1. GNU Emacs modeline documentation
2. VIM modeline documentation
Unix time to be 1234567890 Friday
By chys on February 8th, 2009
The Unix time is going to be 1234567890 this Friday or Saturday, depending on your timezone. In UTC (GMT), the time is Fri Feb 13 23:31:30 2009. This is a time of celebration for geeks!
For most people in the Western Hemisphere, this time falls on a Friday the 13th. We have three Fridays the 13th this year…
For people in the Eastern Hemisphere, it falls on this year’s St. Valentine’s Day.
The Unix time, or POSIX time, equals the number of seconds elapsed since the ‘Unix epoch’ (Jan 1 00:00:00 1970 UTC), ignoring leap seconds. The type time_t in all Unices and Unix-likes, and many non-Unix systems including Microsoft Windows, stores Unix time.
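The conversion is easy to check from the shell. This sketch assumes GNU date, whose -d @SECONDS form accepts a Unix timestamp (BSD and macOS date use -r SECONDS instead); unix_to_utc is a hypothetical helper name.

```shell
#!/bin/bash
# unix_to_utc SECONDS: print a Unix timestamp as a UTC date.
# Assumes GNU date; LC_ALL=C keeps day and month names in English.
unix_to_utc() {
    LC_ALL=C date -u -d "@$1" '+%a %b %e %H:%M:%S %Y'
}

unix_to_utc 0            # the Unix epoch: Thu Jan  1 00:00:00 1970
unix_to_utc 1234567890   # Fri Feb 13 23:31:30 2009
```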
Certainly our systems will not have any timing problem this Friday the 13th. We’re really going to have problems at Tue Jan 19 03:14:08 2038, when the Unix time overflows a 32-bit signed integer. Hopefully nobody will still be using 32-bit systems by then. Represented as a 64-bit signed integer, Unix time won’t overflow for about 292.5 billion years.
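Both figures can be reproduced with shell arithmetic. A rough sketch, assuming bash's 64-bit integer arithmetic, GNU date (for -d @SECONDS), and years of exactly 365 days; the variable names are made up for the example.

```shell
#!/bin/bash
max32=$(( (1 << 31) - 1 ))             # largest 32-bit signed value: 2147483647
max64=9223372036854775807              # largest 64-bit signed value: 2^63 - 1
secs_per_year=$(( 365 * 24 * 3600 ))   # ignores leap years

# The last second a signed 32-bit time_t can represent;
# the overflow happens one second later.
LC_ALL=C date -u -d "@$max32" '+%a %b %e %H:%M:%S %Y'   # Tue Jan 19 03:14:07 2038

# Whole years that fit into a signed 64-bit counter of seconds.
years64=$(( max64 / secs_per_year ))
echo "64-bit time_t lasts about $(( years64 / 1000000000 )) billion years"
```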
