Posts Tagged ‘assembly’

Intel announces AVX2

The documentation is available for download.

The instruction set war goes on: Intel still doesn’t plan to support many of AMD’s XOP features, and it still plans to use FMA3 while AMD uses FMA4. Nevertheless, this time Intel is at least not making the war any worse. In addition to extending most SSE2/SSE3/SSE4 instructions to 256 bits (no surprise there), they copied BMI (adding an extension called BMI2) and CVT16 from AMD. If I recall correctly, Intel has never copied so many instructions from AMD at once, with the notable exception of x86-64.


.note.GNU-stack

GCC always appends one line to any assembler (.s) file it generates:

	.section	.note.GNU-stack,"",@progbits

Literally, it adds an empty section named .note.GNU-stack to the object file, but it actually serves as a hint to the linker* that the code in this object file does not require an executable stack. The GNU assembler also accepts the command-line option “--noexecstack”, which has the same effect.

If every object file contains a section of this name, the linker knows the whole program does not need an executable stack, and the resulting executable will run with a non-executable stack if the OS and underlying hardware support it (see also NX bit).

Why is this important? In practice, virtually no program needs an executable stack (hackers may sometimes use it, though), but buffer overflow attacks frequently insert and run code in stacks. A non-executable stack helps improve security without any overhead.

* GNU linker only.
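In the linked executable, the linker's decision ends up in the flags of a PT_GNU_STACK program header. Here is a minimal sketch, assuming Linux and a C library that provides getauxval(), of how a program could inspect that header for itself. The function name stack_marked_executable and the phdr_t typedef are my own, not part of any API.

```c
#include <elf.h>
#include <stddef.h>
#include <sys/auxv.h>

/* Pick the program header layout matching the pointer size
 * (assumption: the process is a native ELF32 or ELF64 binary). */
#if __SIZEOF_POINTER__ == 8
typedef Elf64_Phdr phdr_t;
#else
typedef Elf32_Phdr phdr_t;
#endif

/* Hypothetical helper: returns 1 if the main program's PT_GNU_STACK
 * header requests an executable stack, 0 if it does not, and -1 if
 * no such header can be found. */
int stack_marked_executable(void)
{
    /* The kernel passes the address and count of the program headers
     * of the main executable in the auxiliary vector. */
    const phdr_t *ph = (const phdr_t *)getauxval(AT_PHDR);
    unsigned long count = getauxval(AT_PHNUM);

    if (ph == NULL)
        return -1;
    for (unsigned long i = 0; i < count; i++) {
        if (ph[i].p_type == PT_GNU_STACK)
            return (ph[i].p_flags & PF_X) != 0;
    }
    return -1;
}
```

For a program built entirely from GCC-generated objects, I would expect this to return 0.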


Unaligned access

Misalignment is not an error (only incurs a performance penalty) on x86 processors except for a few new instructions added in recent years. MOVDQA, for example, is an SSE2 instruction requiring alignment on 16-byte boundaries.

Textbooks have normally taught us we get a bus error if a CPU which disallows unaligned access actually encounters one.

But we observe a Linux process passing misaligned addresses to MOVDQA receives SIGSEGV (segmentation fault) instead of SIGBUS (bus error), on both ia32 and x86-64.

laptop /tmp $ cat a.c
int main ()
{
    char X[32];
    asm ("pxor %%xmm0,%%xmm0; movdqa %%xmm0,%0" : "=m"(X[1]) :: "xmm0");
    return 0;
}
laptop /tmp $ gcc -msse2 a.c
laptop /tmp $ ./a.out
Segmentation fault
laptop /tmp $ kill -l $?
SEGV
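Whether MOVDQA faults depends only on whether the address is a multiple of 16. A minimal sketch of that test in C (the helper name is_aligned16 is made up for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of 16, which is
 * the alignment MOVDQA requires for its memory operand, 0 otherwise. */
int is_aligned16(const void *p)
{
    /* The low four bits of a 16-byte-aligned address are all zero. */
    return ((uintptr_t)p & 15u) == 0;
}
```

If a buffer X happens to start on a 16-byte boundary, is_aligned16(X) holds and is_aligned16(&X[1]) does not.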

x86-64 (and ia32 beginning with the 80486SX) supports disallowing any misaligned access*. In that case, a normal instruction raises SIGBUS, but instructions which inherently require alignment (e.g. MOVDQA) still raise SIGSEGV. It’s not so consistent.

* It is normally disabled. To enable it, set the AC bit (bit 18) in EFLAGS:

pushf
orl $0x40000,(%esp) (orq $0x40000,(%rsp) on x86-64)
popf


Floating point exception

It is already confusing enough that “floating point exception” may mean “division by zero” in integer arithmetic. It turns out it can also mean “overflow” in some cases, as in the following program (it’s difficult to do in C, so I had to use assembly):

#include <asm/unistd.h>
.code:
.globl _start
_start:
    mov $1, %eax
    mov $1, %edx
    div %eax
    mov $__NR_exit_group, %eax
    int $0x80

(Type “gcc -m32 -nostdlib a.S” to compile and link.)

In this program, EDX:EAX (0x100000001) divided by EAX (0x1) cannot be represented as a 32-bit integer, and thus it is an overflow. X86 CPUs raise a “division by zero” interrupt (int 0) in such cases, and “division by zero” is displayed as “floating point exception” in Linux…
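The condition under which the unsigned 32-bit DIV raises that interrupt can be modelled in C. This is only a sketch of the rule, not something that triggers the fault itself, and the function name div32_faults is my own:

```c
#include <stdint.h>

/* Hypothetical model of the unsigned 32-bit DIV: the dividend is the
 * 64-bit value hi:lo (EDX:EAX). Returns 1 if the CPU would raise the
 * divide error, 0 if the quotient fits in 32 bits. */
int div32_faults(uint32_t hi, uint32_t lo, uint32_t divisor)
{
    if (divisor == 0)
        return 1;                       /* genuine division by zero */

    uint64_t dividend = ((uint64_t)hi << 32) | lo;

    /* The quotient must fit in EAX; otherwise it is an overflow,
     * reported through the same interrupt. */
    return dividend / divisor > UINT32_MAX;
}
```

For the values in the program above, div32_faults(1, 1, 1) returns 1, since 0x100000001 / 1 does not fit in 32 bits.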


PS. The same assembly program in Intel style:

.code
.startup
    MOV EAX,1
    MOV EDX, 1
    DIV EAX

    MOV EAX, __NR_exit_group
    INT 80H


Every C programmer should learn some assembly

I am more convinced of this now.

One of the most frequently asked questions in C is the difference between a pointer and an array. A newbie in C often finds it “mission impossible” to differentiate between the following four variable types:
char p1[][8] = { "Hello", "world" };
char *p2[8] = { "Hello", "world" };
char (*p3)[8] = p1;
char **p4 = p2;

And it really is difficult to explain it clearly in a few words. However, if one knows some assembly, one can check the assembly listing generated by a compiler, and at least the difference between p1 and p2 should be straightforward:

p1:
    .string "Hello"
    .zero 2
    .string "world"
    .zero 2
.LC0:
    .string "Hello"
.LC1:
    .string "world"
p2:
    .long .LC0
    .long .LC1
    .zero 24
p3:
    .long p1
p4:
    .long p2

(I prefer the AT&T-style assembly)
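For readers who would rather stay in C, sizeof tells part of the same story. A small sketch using the four declarations from this post (the size_of_* helper names are made up for illustration):

```c
#include <stddef.h>

char p1[][8] = { "Hello", "world" };   /* 2 rows of 8 chars, stored inline */
char *p2[8] = { "Hello", "world" };    /* 8 pointers; the strings live elsewhere */
char (*p3)[8] = p1;                    /* one pointer to a row of 8 chars */
char **p4 = p2;                        /* one pointer to a pointer to char */

/* Hypothetical helpers returning the storage each variable occupies. */
size_t size_of_p1(void) { return sizeof p1; }  /* 2 * 8 = 16 bytes of characters */
size_t size_of_p2(void) { return sizeof p2; }  /* 8 * sizeof(char *) */
size_t size_of_p3(void) { return sizeof p3; }  /* size of a single pointer */
size_t size_of_p4(void) { return sizeof p4; }  /* size of a single pointer */
```

Both p3[1] and p4[1] lead to the string "world", but p3 gets there by adding 8 bytes to an address while p4 has to load a second pointer from memory.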

I feel so lucky that I had learned some assembly (the kind used on the NES) before starting C. So for me, a “pointer” has always been a very natural concept, and surely different from an array. Many poor freshman undergrads had to begin with C++ without any knowledge of assembly, C, or any other language; I would have gone crazy had I been in such a situation.


mov %edi, %edi

Here is a simple C function:

long foo (unsigned a, unsigned b)
{
    return ((long)b<<32)|a;
}

Compile it with an x86-64-targeted GCC with proper optimizations enabled (-O2, for example), and you get the following instructions (in AT&T-style assembly):

foo:
        movq    %rsi, %rax
        mov     %edi, %edi
        salq    $32, %rax
        orq     %rdi, %rax
        ret

Pay attention to the second instruction, ‘mov %edi, %edi’. Literally, it assigns the value of register edi to register edi. Five years ago, anybody would have agreed that this instruction does nothing, just like a nop. But on an x86-64 system, this is not the case.

In x86-64 assembly, any instruction with a 32-bit register as its destination zeroes the higher 32 bits of the corresponding 64-bit register at the same time. Consequently, the function of ‘mov %edi, %edi’ is zeroing bits 32 to 63 of register rdi while leaving the lower 32 bits (i.e., register edi) unchanged.
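In C terms, the zeroing rule is just a conversion to a 32-bit unsigned type and back. A minimal sketch, with names of my own choosing (mov_r32_r32 models the effect of the instruction on the full 64-bit register, and foo_u64 is an unsigned variant of foo):

```c
#include <stdint.h>

/* Hypothetical model of 'mov %edi, %edi': keep the low 32 bits of the
 * 64-bit register value and clear bits 32 to 63. */
uint64_t mov_r32_r32(uint64_t reg)
{
    return (uint32_t)reg;   /* truncate, then zero-extend back to 64 bits */
}

/* Unsigned variant of foo: a must be zero-extended before the OR,
 * which is exactly the job 'mov %edi, %edi' does in the listing. */
uint64_t foo_u64(uint32_t a, uint32_t b)
{
    return ((uint64_t)b << 32) | a;
}
```

The caller is not required to clear the upper half of rdi when passing a 32-bit argument, so the callee has to do it before using the full register.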

One may want to rewrite it with a more intuitive ‘and’ instruction:

andq $0xffffffff, %rdi

But this does NOT assemble! The reason is that $0x00000000ffffffff is not representable as a sign-extended 32-bit immediate, and 64-bit immediates are currently allowed only in mov instructions whose destination is a general-purpose register (such a mov is usually written explicitly as movabsq). So if one must use and, one needs something like this:

movl $0xffffffff, %eax
andq %rax, %rdi

Remember the zeroing rule for operations on 32-bit registers: ‘movl $0xffffffff, %eax’ is equivalent to ‘movabsq $0xffffffff, %rax’…
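The rule for which constants a 64-bit and accepts can be written down in C. This is a sketch of the encoding constraint only, and the function name fits_simm32 is made up:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the 64-bit value v can be encoded
 * as a 32-bit immediate that the CPU sign-extends to 64 bits. */
int fits_simm32(uint64_t v)
{
    /* Sign extension copies bit 31 into bits 32 to 63, so bits 31 to 63
     * (33 bits in total) must be either all zeros or all ones. */
    uint64_t top = v >> 31;
    return top == 0 || top == 0x1ffffffffULL;
}
```

fits_simm32(0xffffffff) is 0, which is why the andq above is rejected, while fits_simm32(0xffffffffffffffff) is 1 because that value is simply -1.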

X86-64 assembly really is too ugly, at least in this sense…

Reference
[1] Gentle Introduction to x86-64 Assembly
