Posts Tagged ‘assembly’

Unaligned access

Misalignment is not an error (only incurs a performance penalty) on x86 processors except for a few new instructions added in recent years. MOVDQA, for example, is an SSE2 instruction requiring alignment on 16-byte boundaries.

Textbooks have normally taught us we get a bus error if a CPU which disallows unaligned access actually encounters one.

But we observe a Linux process passing misaligned addresses to MOVDQA receives SIGSEGV (segmentation fault) instead of SIGBUS (bus error), on both ia32 and x86-64.

laptop /tmp $ cat a.c
int main ()
{
    char X[32];
    asm ("pxor %%xmm0,%%xmm0; movdqa %%xmm0,%0" : "=m"(X[1]) :: "xmm0");
    return 0;
}
laptop /tmp $ gcc -msse2 a.c
laptop /tmp $ ./a.out
Segmentation fault
laptop /tmp $ kill -l $?
SEGV

x86-64 (and ia32 beginning 80486SX) supports disallowing any misaligned access*. In that case, a normal instruction raises SIGBUS, but instructions which inherently requires alignment (e.g. MOVDQA) still raises SIGSEGV. It’s not so consistent.

* It is normally disabled. To enable it, set the AC bit in FLAGS:

pushf
or $0x40000,(%esp) (or %rsp on x86-64)
popf

Tags: , , ,

Floating point exception

It is already confusing enough that “floating point exception” may mean “division by zero” in integral arithmetic. It turns out it can also mean “overflow” in some cases, as in the following program (it’s difficult in C, so I had to use assembly):

#include <asm/unistd.h>
.code:
.globl _start
_start:
    mov $1, %eax
    mov $1, %edx
    div %eax
    mov $__NR_exit_group, %eax
    int $0x80

(Type “gcc -m32 -nostdlib a.S” to compile and link.)

In this program, EDX:EAX (0x100000001) divided by ECX (0x1) cannot be represented in 32-bit integer and thus it is an overflow. X86 CPUs raise a “division by zero” interruption (int 0) in such cases, and “division by zero” is displayed as “floating point exception” in Linux…


PS. The same assembly program in Intel style:

.code
.startup
    MOV EAX,1
    MOV EDX, 1
    DIV EAX

    MOV EAX, __NR_exit_group
    INT 80H

Tags: ,

Every C programmer should learn some assembly

I am more convinced of this now.

One of the most frequently asked questions in C is the difference between a pointer and an array. A newbie in C often finds it “mission impossible” to differentiate between the following four variable types:
char p1[][8] = { "Hello", "world" };
char *p2[8] = { "Hello", "world" };
char (*p3)[8] = p1;
char **p4 = p2;

And it really is difficult to explain it clearly in a few words. However, if one knows some assembly, one can check the assembly listing generated by an assemblera compiler and at least the difference between p1 and p2 should be straightforward:

p1:
    .string "Hello"
    .zero 2
    .string "world"
    .zero 2
.LC0:
    .string "Hello"
.LC1:
    .string "world"
p2:
    .long .LC0
    .long .LC1
p3:
    .long p1
p4:
    .long p2

(I prefer the AT&T-style assembly)

I feel so lucky that I had learned some assembly used in NES before starting C. So for me “pointer” has always been a very natural concept and surely different from an array. Many poor freshmen undergrads had to begin with C++ without any knowledge in assembly or C or even any other language – I would have been crazy had I been under such a situation.

Tags: ,

mov %edi, %edi

Here is a simple C function:

long foo (unsigned a, unsigned b)
{
    return ((long)b<<32)|a;
}

Compile it with an x86-64-targeted GCC with proper optimizations enabled (-O2 for example), you get the following instructions (in AT&T-style assembly):

foo:
        movq    %rsi, %rax
        mov     %edi, %edi
        salq    $32, %rax
        orq     %rdi, %rax
        ret

Pay attention to the red line. Literally it means assigning the value of register edi to register edi. Five years ago, anybody would agree this instruction does nothing like nops. But in an x86-64 system, this is not the case.

In x86-64 assembly, any instruction with a 32-bit register as its destination zeroes the higher 32 bits of the corresponding 64-bit register at the same time. Consequently, the function of ‘mov %edi, %edi’ is zeroing bits 32 to 63 of register rdi while leaving the lower 32 bits (i.e., register edi) unchanged.

One may want to rewrite it with a more intuitive and instruction:

andq $0xffffffff, %rdi

But this does NOT assemble! Because $0×00000000ffffffff is not representable in signed 32-bit format, but 64-bit immediates are currently allowed only in mov instructions whose destination is a general-purpose register (such a mov is usually explicitly written as movabsq). So if one must use and, one need something like this:

movl $0xffffffff, %eax
andq %rax, %rdi

Remember the zeroing rule for operations on 32-bit registers, so ‘movl $0xffffffff, %eax’ is equivalent to ‘movabsq $0xffffffff, %rax’…

X86-64 assembly really is too ugly, at least in this sense…

Reference
[1] Gentle Introduction to x86-64 Assembly

Tags: ,