&errno as thread identifier
By chys on March 13th, 2010
Thread identifiers have different types on different systems – pthread_t under POSIX; uintptr_t or HANDLE under Windows. They are usually integers or pointers, but POSIX actually allows pthread_t to be a structure, though this is rarely seen in practice. This makes portable code slightly awkward.
A trick, used by OpenSSL (if not overridden by calling CRYPTO_set_id_callback), is to use &errno as the thread identifier.
Per C standard, errno must be a modifiable lvalue, and thus we can safely take its address – &errno. And under any practically usable thread implementation, errno must be thread-local, which means &errno is different in different threads.
Note: It seems OpenSSL defaults to &errno only since 0.9.8m. Earlier versions always require the user to call CRYPTO_set_id_callback.
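The trick can be sketched in a few lines of C. This is only an illustration, assuming a POSIX threads environment; the helper names thread_id_from_errno, record_id and errno_ids_differ are made up for this sketch and are not OpenSSL functions.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper: use the address of errno as the caller's thread id. */
static uintptr_t thread_id_from_errno(void)
{
    return (uintptr_t)&errno;
}

/* Thread entry point: store this thread's id where the creator can read it. */
static void *record_id(void *out)
{
    *(uintptr_t *)out = thread_id_from_errno();
    return NULL;
}

/* Returns 1 if a second thread saw a different &errno than the caller. */
static int errno_ids_differ(void)
{
    pthread_t t;
    uintptr_t other = 0;
    int rc;

    rc = pthread_create(&t, NULL, record_id, &other);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return other != 0 && other != thread_id_from_errno();
}
```

The two threads are alive at the same time when the id is recorded, so their errno objects must be distinct. An id obtained this way is only meaningful while its thread is running; the address may be reused after the thread exits.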
Tags: multithread, portability
Acquire and Release Semantics
By chys on March 2nd, 2010
The concept of acquire and release semantics is important for multi-threaded programs that run on more than one physical core or processor. MSDN has a clear and concise explanation of them.
Consider the following code example:
a++;
b++;
c++;
From another processor’s point of view, the preceding operations can appear to occur in any order. For example, the other processor might see the increment of b before the increment of a.
…
[T]he InterlockedIncrementAcquire routine uses acquire semantics to increment a variable. If you rewrote the preceding code example as follows:
InterlockedIncrementAcquire(&a);
b++;
c++;
other processors would always see the increment of a before the increments of b and c.
Likewise, the InterlockedIncrementRelease routine uses release semantics to increment a variable. If you rewrote the code example once again, as follows:
a++;
b++;
InterlockedIncrementRelease(&c);
other processors would always see the increments of a and b before the increment of c.
The operation of acquiring a lock must have acquire semantics; and the operation of releasing a lock must have release semantics. This is probably where they get their names.
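The lock case can be sketched with C11 atomics instead of the Windows Interlocked routines. This is a minimal spinlock for illustration only, assuming a C11 compiler and POSIX threads; all names (lock_flag, spin_lock, spin_unlock, worker, run_two_workers) are invented for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long counter = 0;            /* protected by lock_flag */

static void spin_lock(void)
{
    /* Acquire: accesses after this point cannot be reordered before it. */
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire))
        ;                           /* spin until the flag was previously clear */
}

static void spin_unlock(void)
{
    /* Release: accesses before this point cannot be reordered after it. */
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

/* Each worker increments the shared counter n times under the lock. */
static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_lock();
        counter++;
        spin_unlock();
    }
    return NULL;
}

/* Runs two workers concurrently and returns the final counter value. */
static long run_two_workers(long n)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    rc = pthread_create(&a, NULL, worker, &n);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, &n);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

The acquire on locking and the release on unlocking keep the plain counter++ inside the critical section, so no increment is lost.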
Tags: dev, multithread
Successfully filed a bug for GCC
By chys on March 1st, 2010
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43215
This is the third time I have encountered a real bug in GCC, the second time in a non-experimental component, and the first time I was the first to report one. Unfortunately I’m unable to fix it, but other people have:
Index: i386.md
===================================================================
--- i386.md (revision 157132)
+++ i386.md (working copy)
@@ -3245,7 +3245,7 @@
case 9:
case 10:
- return "%vmovd\t{%1, %0|%0, %1}";
+ return "%vmovq\t{%1, %0|%0, %1}";
default:
gcc_unreachable();
Tags: GCC, opensource
I like this feature
By chys on February 12th, 2010
It’s in the newly released KDE SC 4.4*. We can now “group” windows into one tabbed interface.
* It seems they have recently decided their product has grown from a “desktop environment” to a “software compilation”… It also seems they have decided that “KDE,” previously standing for “K Desktop Environment,” is now a name in its own right?
Tags: KDE
The Tuesday following the first Monday in November
By chys on February 6th, 2010
Martin Luther King Day falls on the third Monday of January; Daylight Saving Time begins on the second Sunday of March; Father’s Day falls on the third Sunday in June; …
To represent these dates in a program, the intuitive method is to use (ordinal, weekday, month) tuples, with Sunday as weekday 0: (3,1,1), (2,0,3), (3,0,6), respectively. But this is probably not the best idea, because the set of representable observances is limited. For instance, we cannot represent Election Day (Tuesday following the first Monday of November) or Memorial Day (last Monday of May).
Nevertheless, it is still possible to represent all the above in three small integers, by replacing the first integer in the tuple with the earliest possible day of month. For instance,
Martin Luther King Day = the Monday between January 15 and 21, represented by (15,1,1)
Election Day = the Tuesday between November 2 and 8, represented by (2,2,11)
Thanksgiving (before 1939) = last Thursday of November = the Thursday between November 24 and 30, represented by (24,4,11)
Thanksgiving (modern) = fourth Thursday of November = the Thursday between November 22 and 28, represented by (22,4,11)
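Resolving such a tuple to a concrete day of the month takes only a little arithmetic. The sketch below is one possible way to do it, not a reference implementation: day_of_week uses Sakamoto's well-known method for the Gregorian calendar, and both function names are invented here. Weekdays are numbered from Sunday = 0, as in the tuples above.

```c
#include <assert.h>

/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
 * Sakamoto's method; month is 1..12. */
static int day_of_week(int year, int month, int day)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

    assert(month >= 1 && month <= 12);
    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400 + t[month - 1] + day) % 7;
}

/* Resolve (earliest, weekday, month) for a given year: the first day
 * on or after `earliest` that falls on `weekday`. */
static int observance_day(int earliest, int weekday, int month, int year)
{
    int w;

    assert(weekday >= 0 && weekday <= 6);
    w = day_of_week(year, month, earliest);
    return earliest + (weekday - w + 7) % 7;
}
```

For example, observance_day(2, 2, 11, 2008) gives 4 (Election Day was November 4, 2008), and observance_day(15, 1, 1, 2010) gives 18 (Martin Luther King Day 2010).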
Tags: dev
Extract Deb files from command line
By chys on February 2nd, 2010
Debian and its derivatives use the .deb format to distribute their packages. To extract them, use ar – yes, the very program we programmers use to make static libraries.
ar x sudo_1.6.9p17-2_i386.deb
Or we can directly extract things from data.tar.gz contained in the .deb file:
ar p sudo_1.6.9p17-2_i386.deb data.tar.gz | tar -xzf -
Although I am no longer a user of Debian GNU/Linux, I still have to remember how to extract .deb files. I frequently need to cross-compile a 64-bit version of my program on a 32-bit system, and vice versa, but I don’t want to cross-compile all the libraries my program depends on myself. Instead, I find it a good idea to download the right .deb file from the Debian Packages Repository and pick out the .so files.
Tags: compression, Debian, Gentoo, Linux
PDP-endian
By chys on January 13th, 2010
Today I checked <endian.h> and found an unfamiliar line:
#define __PDP_ENDIAN 3412
I knew there were little-endian machines (e.g. all Intel CPUs*) and big-endian machines (e.g. PowerPC†), but was really unaware of the so-called PDP-endian. It’s word-wise big-endian, and within a word‡, little-endian.
char c[5] = {0};
int32_t v = 0x61626364;
memcpy (c, &v, sizeof v); /* copy the in-memory bytes; avoids an unaligned, aliasing cast */
puts (c);
Assuming ASCII, this program segment prints “abcd” on a big-endian machine and “dcba” on a little-endian machine. On a PDP-11, it seems it should print “badc”.
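Since few of us have a PDP-11 at hand, the three layouts can be simulated on any machine by writing the bytes out explicitly. The following is a sketch under that assumption; the store_* function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian: most significant byte first. */
static void store_big(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Little-endian: least significant byte first. */
static void store_little(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/* PDP-endian: high 16-bit word first, but each word little-endian. */
static void store_pdp(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 16);  /* low byte of high word  */
    out[1] = (unsigned char)(v >> 24);  /* high byte of high word */
    out[2] = (unsigned char)v;          /* low byte of low word   */
    out[3] = (unsigned char)(v >> 8);   /* high byte of low word  */
}
```

With v = 0x61626364, the three functions produce the bytes 61 62 63 64, 64 63 62 61 and 62 61 64 63 respectively, which matches the 3412 in the macro name.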
Little-endian is friendly to programmers (in some sense). Big-endian is intuitive. But what’s PDP-endian for?
* Itanium supports both little-endian and big-endian. Windows and Linux for Itanium both use little endian.
† PowerPC supports both little-endian and big-endian. Macintosh used big endian.
‡ The term “word” here stands for two bytes. This usage is believed to be wrong though used by Intel, but I just don’t find a good substitute.
Tags: dev
Unaligned access
By chys on December 26th, 2009
Misaligned access is not an error on x86 processors (it only incurs a performance penalty), except for a few instructions added in recent years. MOVDQA, for example, is an SSE2 instruction requiring alignment on 16-byte boundaries.
Textbooks have normally taught us we get a bus error if a CPU which disallows unaligned access actually encounters one.
But we observe that a Linux process passing a misaligned address to MOVDQA receives SIGSEGV (segmentation fault) instead of SIGBUS (bus error), on both ia32 and x86-64.
laptop /tmp $ cat a.c
int main ()
{
char X[32];
asm ("pxor %%xmm0,%%xmm0; movdqa %%xmm0,%0" : "=m"(X[1]) :: "xmm0");
return 0;
}
laptop /tmp $ gcc -msse2 a.c
laptop /tmp $ ./a.out
Segmentation fault
laptop /tmp $ kill -l $?
SEGV
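Whether an address satisfies such a requirement can be checked before using it. The helper below is a small sketch, not part of the test program above; is_aligned is a made-up name, and it assumes the usual flat mapping of pointers to integers, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of `boundary`, which must be a power of two. */
static int is_aligned(const void *p, uintptr_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return ((uintptr_t)p & (boundary - 1)) == 0;
}
```

In the program above, X[1] is deliberately one byte past the start of the array, so it can never satisfy is_aligned(&X[1], 16) when X itself is 16-byte aligned.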
x86-64 (and ia32, beginning with the 80486SX) supports disallowing any misaligned access*. In that case, a normal instruction raises SIGBUS, but instructions which inherently require alignment (e.g. MOVDQA) still raise SIGSEGV. It’s not so consistent.
* It is normally disabled. To enable it, set the AC bit in FLAGS:
pushf
or $0x40000,(%esp) (or %rsp on x86-64)
popf
Floating point exception
By chys on December 1st, 2009
It is already confusing enough that “floating point exception” may mean “division by zero” in integral arithmetic. It turns out it can also mean “overflow” in some cases, as in the following program (it’s difficult in C, so I had to use assembly):
#include <asm/unistd.h>
.code:
.globl _start
_start:
mov $1, %eax
mov $1, %edx
div %eax
mov $__NR_exit_group, %eax
int $0x80
(Type “gcc -m32 -nostdlib a.S” to compile and link.)
In this program, the quotient of EDX:EAX (0x100000001) divided by EAX (0x1) cannot be represented in a 32-bit integer, and thus it is an overflow. x86 CPUs raise a “division by zero” exception (interrupt 0) in such cases, and “division by zero” is displayed as “floating point exception” in Linux…
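The condition can be expressed in C without actually triggering the fault. The function below is only a model of the 32-bit unsigned DIV rule described above, written for illustration; div32_raises_de is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Models 32-bit unsigned DIV: returns 1 if dividing EDX:EAX by `divisor`
 * would raise the divide exception (zero divisor or quotient overflow). */
static int div32_raises_de(uint32_t edx, uint32_t eax, uint32_t divisor)
{
    uint64_t dividend;
    int overflow;

    if (divisor == 0)
        return 1;                       /* genuine division by zero */
    dividend = ((uint64_t)edx << 32) | eax;
    overflow = dividend / divisor > UINT32_MAX;
    /* Equivalent test: the quotient overflows exactly when EDX >= divisor. */
    assert(overflow == (edx >= divisor));
    return overflow;
}
```

With the values from the assembly program, div32_raises_de(1, 1, 1) reports a fault, while div32_raises_de(0, 1, 1) does not.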
PS. The same assembly program in Intel style:
.code
.startup
MOV EAX,1
MOV EDX, 1
DIV EAX
MOV EAX, __NR_exit_group
INT 80H
The clock() function
By chys on November 26th, 2009
The ISO C standard specifies that
The clock function determines the processor time used.
It is clear that the result should be processor time instead of wall-clock time (real time).
It turns out that clock() in Microsoft C does return the wall-clock time instead of processor time.
I do understand Microsoft probably were not intentionally trying to violate the standard. It is meaningless to talk about processor time in DOS (nor does DOS provide any mechanism to measure processor time, afaik), and many programs used clock() to measure real time even if there were lots of system calls, disk accesses, etc. (which would make processor time significantly differ from real time in time-sharing systems). Probably Microsoft intended to maintain this “compatibility.” But is this really necessary? They could have corrected this either during the migration from single-task DOS to time-sharing Windows, or from 16-bit Windows 3 to 32-bit Windows 95/NT – just one more “incompatibility,” compared to other huge differences, not so important, was it?
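For reference, this is roughly how clock() is meant to be used on a conforming implementation. The sketch measures the processor time consumed by a function; cpu_seconds and busy_loop are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Returns the processor time, in seconds, consumed while running fn,
 * or -1.0 if clock() reports that processor time is unavailable. */
static double cpu_seconds(void (*fn)(void))
{
    clock_t start, end;

    assert(fn != NULL);
    start = clock();
    fn();
    end = clock();
    if (start == (clock_t)-1 || end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Some CPU-bound work to measure. */
static void busy_loop(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 10000000UL; i++)
        x += i;
}
```

On a conforming system, a function that merely sleeps or waits for I/O should yield a result near zero here, whereas with a wall-clock clock() it yields the full elapsed time.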

