Archive for May, 2009
C++0x in ICC 11
By chys on May 15th, 2009
Intel is moving faster than GNU in supporting the upcoming C++0x. The two features I look forward to most, auto declarations and lambda functions, are both supported satisfactorily. Lambda functions that capture variables defined outside them (output_and_sum in the following demo) are supported as well as simpler ones (add in the following demo).
#include <cstdio>
#include <algorithm>
using namespace std;

int main()
{
    static const int myvec[] = { 1, 2, 3, 4, 5 };
    int sum = 0;
    auto add = [] (int a, int b) -> int { return a+b; };
    auto output_and_sum = [&sum,add](int x) { sum = add(sum,x); printf ("%d\n", x); };
    for_each (myvec, myvec+5, output_and_sum);
    printf ("sum = %d\n", sum);
}
(I use printf instead of cout merely to make the assembly list more readable.)
ICC compiles this program (the option -std=c++0x is necessary, of course) and gives the correct result:
1
2
3
4
5
sum = 15
The two lambda functions are also well inlined and optimized: the variable sum, which the lambda function uses by reference, is kept entirely in a register, though it is easy to find several useless instructions in the assembly dump.
Some other features that are less important in my view (e.g. initializer_list) are not implemented yet. (BTW, I really had a hard time finding the “request noncommercial free license” link on Intel’s website, while I could see the link to “buy a commercial license” everywhere.)
Hopefully C++0x will be more successful than C99, which is filled with ugly and nobody-wants-to-implement features.
Premature optimization
By chys on May 9th, 2009
Premature optimization is the root of all evil, according to Donald Knuth.
Prof. Knuth wrote in his paper Structured Programming with Go To Statements:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Time and again I have known that I should start with a naïve algorithm and not begin optimizing until I am sure it works, yet I keep violating the rule.
Tags: dev
install vs. cp; and mmap
By chys on May 8th, 2009
If we write a Makefile by hand, we should always use install rather than cp for the installation commands. Not only is it more convenient, it also does the right thing where cp does not.
For example, if we attempt to update /bin/bash while it is running with “cp ... /bin/bash”, we get a “text busy” error (ETXTBSY, “Text file busy”). If we attempt to update /lib/libc.so.6 with “cp ... /lib/libc.so.6”, then we either get “text busy” (in ancient versions of Linux) or break each and every running program within a fraction of a second (in recent versions of Linux). install does the right thing in both situations.
The reason cp fails is that it simply opens the destination file in write-only mode and writes the new contents into it. This causes problems because Linux (like all contemporary Unices, and Microsoft Windows too) uses memory mapping (mmap) to load executables and dynamic libraries.
The contents of an executable or dynamic library are mmap’d into the linear address space of the relevant processes. Therefore, any change to the underlying file affects the mmap’d memory regions and can potentially break programs. (MAP_PRIVATE guarantees that changes made by a process to those memory regions are handled by copy-on-write (COW) without affecting the underlying file. In contrast, POSIX leaves it to the implementation whether later modifications to the underlying file are visible through the mapping. In fact, for the sake of efficiency, on Linux such modifications are visible to a process even though MAP_PRIVATE may have been used, as long as the process has not already written to, and thus privately copied, the affected page.)
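To see the effect described in the parenthesis, something like the following sketch could be used. It is only an illustration under stated assumptions: the helper observe_private_mapping and the struct PrivateMapResult are hypothetical names, not part of any library, and the file is overwritten in place without truncation (truncating a mapped file can raise SIGBUS on access, which would obscure the point). What the mapping shows after the overwrite is left unspecified by POSIX, so the code reports the byte rather than assuming it.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string>

// Hypothetical result type for this sketch.
struct PrivateMapResult {
    bool ok;      // true if every system call succeeded
    char before;  // byte seen through the mapping before the file is rewritten
    char after;   // byte seen through the same mapping afterwards
};

// Hypothetical helper: create a scratch file filled with 'A', map it with
// MAP_PRIVATE, overwrite the file in place with 'B' through a second
// descriptor, and report what the mapping shows before and after.
PrivateMapResult observe_private_mapping(const std::string& path) {
    PrivateMapResult r = {false, 0, 0};
    const size_t len = 4096;

    int fd = open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return r;
    std::string a(len, 'A');
    if (write(fd, a.data(), len) != static_cast<ssize_t>(len)) {
        close(fd);
        return r;
    }

    void* m = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (m == MAP_FAILED) {
        close(fd);
        return r;
    }
    // volatile so the second read is not folded into the first
    const volatile char* p = static_cast<const volatile char*>(m);
    r.before = p[0];

    // Overwrite the existing file contents in place (no O_TRUNC).
    std::string b(len, 'B');
    int wfd = open(path.c_str(), O_WRONLY);
    if (wfd >= 0 && pwrite(wfd, b.data(), len, 0) == static_cast<ssize_t>(len)) {
        r.after = p[0];
        r.ok = true;
    }

    if (wfd >= 0) close(wfd);
    munmap(m, len);
    close(fd);
    unlink(path.c_str());
    return r;
}
```

Calling observe_private_mapping with a scratch path and printing before and after shows whether the in-place overwrite leaked into the private mapping on the system at hand.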
There is an option, MAP_DENYWRITE, which disallows any modification to the underlying file and was designed to avoid the situations described above. Executables and dynamic libraries are all mmap’d with this option. Unfortunately, MAP_DENYWRITE turned out to be a source of DoS attacks, forcing Linux to ignore the option in recent versions.
Executables are mmap’d by the kernel (in the execve syscall). For kernel code, MAP_DENYWRITE still works, and therefore we get “text busy” errors if we attempt to modify the executable.
On the other hand, dynamic libraries are mmap’d by userspace code (for example, by loaders like /lib/ld-linux.so). This code still passes MAP_DENYWRITE to the kernel, but newer kernels silently ignore the option. The bad consequence is that you can break the whole system when you think you’re only upgrading the C runtime library.
How, then, does install solve this problem? Very simply: it unlinks the file before writing the new one. The old file (no longer present in any directory entry but still on disk until the last program referring to it exits) and the new file then have different inodes. Programs started before the upgrade (which keep using the old file) and those started after it (which use the new version) will both be happy.
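The unlink-then-write idea can be sketched in a few lines of C++ on top of the POSIX calls. This is a minimal illustration of the technique as described above, not the actual source of install; the helper names inode_of and replace_by_unlink are hypothetical.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cerrno>
#include <string>

// Hypothetical helper: inode number of path, or 0 if stat fails.
ino_t inode_of(const std::string& path) {
    struct stat st;
    return stat(path.c_str(), &st) == 0 ? st.st_ino : 0;
}

// Hypothetical helper: replace the file at path by first unlinking the old
// name and then creating a brand-new file, so that existing mappings and
// open descriptors keep referring to the old, untouched inode.
bool replace_by_unlink(const std::string& path, const std::string& contents,
                       mode_t mode = 0755) {
    // Remove the old directory entry; a missing file is not an error here.
    if (unlink(path.c_str()) != 0 && errno != ENOENT) return false;

    // O_EXCL makes sure we really create a new file under that name.
    int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_EXCL, mode);
    if (fd < 0) return false;

    const char* p = contents.data();
    size_t left = contents.size();
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            close(fd);
            return false;
        }
        p += n;
        left -= static_cast<size_t>(n);
    }
    return close(fd) == 0;
}
```

If a process holds the old file open or mapped while replace_by_unlink runs, inode_of(path) reports a different inode afterwards, and the process still sees the old contents. One limitation of this sketch is that the name briefly does not exist between unlink and open.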
liblzmadec sucks
By chys on May 1st, 2009
Both the executable lzma and the library liblzmadec are included in the package lzma-utils (in Gentoo).
If I compress a file a.txt like this:
lzma a.txt
then liblzmadec can decompress a.txt.lzma perfectly.
However, if I compress it like this:
lzma < a.txt > a.txt.lzma
then liblzmadec fails.
LZMA is a compression algorithm whose adoption has been growing rapidly over the past year. In most cases, it achieves a significantly better compression ratio than bzip2 and gzip, and it decompresses significantly faster. However, its compression process is extremely slow (it probably takes 10 times as long as bzip2).
One of the algorithms supported by 7zip is LZMA. In Linux we usually use LZMA Utils instead.
GNU provided tarballs of some versions of coreutils in LZMA (in addition to the traditional GZIP), although they have recently switched to xz.
Tags: compression, dev
