Trying out X32 ABI

The x32 ABI is a variant of the x86-64 ABI which uses 32-bit pointers (thus limiting the address space of a process to 4 GiB). Today I was in the mood to try it out, but soon discovered how challenging it would be…

  • I had been aware that my GCC 4.7 and 4.8 (trunk) could create x32 object files, but I didn’t have an x32 version of glibc installed. To install it, I needed to set GCC 4.7 as the default compiler, which was too risky for me. (Gentoo Bugzilla still has many unresolved bugs “*** fails to build with gcc-4.7”. I’m willing to try new things in virtual machines, but my host system must be stable.)
  • Recalling that Gentoo had released an x32 Release Candidate, I guessed it might be a good idea to install it in VirtualBox. But soon, I found I was unable to chroot into my new system to finalize the installation – all my LiveCDs were too old to support x32!
    • The most recent Gentoo LiveCD obviously used a 3.2 kernel, while x32 requires 3.4 or later.
    • Suddenly Arch Linux came to my mind. Arch is known for shipping bleeding-edge packages. I downloaded its latest ISO. Its kernel version was right (3.4), but x32 binaries insisted on segfaulting: the kernel had been built without x32 support (CONFIG_X86_X32 not set)!
    • Then I tried the Ubuntu 12.10 Alpha 3 LiveCD, but it only provided a busybox shell, without LVM support.
  • OK. Should I make a regular amd64 installation, and then install x32 from there? Or should I build a kernel and an ISO myself? I didn’t want to spend more time on it. Finally I decided to just create a chroot environment for x32.

AMD must have anticipated the x32 ABI at the time they were drafting AMD64*. Otherwise, it’s impossible to explain why they explicitly allowed 32-bit addressing (the 0x67 prefix) in 64-bit long mode.

* Also known as x86-64 since Intel copied it and blatantly called it Intel 64 Architecture (different from IA-64 Architecture).


ELF Symbol table

I’ve recently been doing some hacking (for food). I needed to modify some object files (.o) generated by GCC, using elfutils, before they were passed to the linker.

I thought I was doing everything right, but ld (the GNU linker) insisted on crashing with a segmentation fault. Finally, eu-elflint told me what was wrong – I was unaware of the ELF requirement that LOCAL symbols must come before GLOBAL symbols in a symtab (symbol table), and that sh_info is the boundary between LOCAL and GLOBAL symbols.
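The ordering rule can be sketched as a small check. This is only an illustration of the requirement, not how eu-elflint works internally; the function name check_symtab_order and its inputs (a list of raw st_info bytes plus the section’s sh_info value) are hypothetical. It relies on two facts from the ELF specification: the binding is stored in the upper four bits of st_info, and STB_LOCAL is 0.

```python
STB_LOCAL = 0  # binding value for local symbols in the ELF specification


def binding(st_info):
    """Return the binding stored in the upper four bits of st_info."""
    return st_info >> 4


def check_symtab_order(st_infos, sh_info):
    """Check that all LOCAL symbols come first and that sh_info is the
    index of the first non-LOCAL symbol.

    st_infos: st_info byte of each symbol, in table order (hypothetical input).
    sh_info:  sh_info field of the .symtab section header.
    """
    for index, st_info in enumerate(st_infos):
        is_local = binding(st_info) == STB_LOCAL
        if index < sh_info and not is_local:
            return False  # non-local symbol inside the local range
        if index >= sh_info and is_local:
            return False  # local symbol after the boundary
    return sh_info <= len(st_infos)
```

For example, a table with two LOCAL entries followed by two GLOBAL ones passes with sh_info == 2, and fails if a GLOBAL entry is moved in front of a LOCAL one.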

Many thanks to the authors of elfutils, especially for their eu-elflint!


Increasing the size of a VirtualBox vdi virtual drive

VBoxManage modifyhd Win7.vdi --resize SizeInMiB
Then go into the guest OS and create new partitions or extend existing partitions.


A problem with pipes in Python 3

The most disturbing change from Python 2 to 3 definitely is not the print() function; nor that some functions which used to return lists now return iterators; nor the removal of __cmp__; but the transition to Unicode.

I’m completely supportive of the transition per se, but I’m disappointed that they’re trying to compel us to use Unicode by dropping useful functionality for byte streams/8-bit strings. For example, bytes has no format or % in Python 3.

I have some code like this:

proc = subprocess.Popen((....), stdout=subprocess.PIPE)
for line in proc.stdout:
    ...

I found that, on Linux, this code snippet is almost 10 times slower in Python 3 than in Python 2. Then I strace’d the code and found Python 3 is passing a length of 1 to read, incurring thousands of times more system calls than Python 2. Are you kidding me? I was forced to use something like proc.stdout.read(...).

I understand this is not the direct result of the transition to Unicode, but it is somehow related.
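A minimal sketch of the read()-based workaround, assuming the output is line-oriented bytes. The helper names iter_lines_chunked and run_and_collect and the 64 KiB chunk size are my own choices for illustration, not anything from the standard library.

```python
import subprocess
import sys


def iter_lines_chunked(stream, chunk_size=65536):
    """Yield lines (as bytes, newline included) from a binary stream,
    reading up to chunk_size bytes per call instead of one byte."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # EOF
        pending += chunk
        lines = pending.split(b"\n")
        pending = lines.pop()  # last piece has no newline yet
        for line in lines:
            yield line + b"\n"
    if pending:
        yield pending  # final line without a trailing newline


def run_and_collect(argv):
    """Run argv (a hypothetical command) and return its output lines."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    lines = list(iter_lines_chunked(proc.stdout))
    proc.stdout.close()
    proc.wait()
    return lines
```

Passing an explicit buffer size to Popen (for example bufsize=-1) may also help if the slowdown comes from the pipe being opened unbuffered, but I have not measured that here.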


Intel announces AVX2

The documentation is available for download.

The instruction set war is still there – Intel still doesn’t plan to support many XOP features of AMD; also Intel still plans to use FMA3 while AMD uses FMA4. Nevertheless, this time Intel is at least not making the war even worse. In addition to extending most SSE2/SSE3/SSE4 instructions to 256 bits (this is no surprise), they copied BMI (with an extension called BMI2) and CVT16 from AMD. If I recall correctly, Intel had never copied so many instructions from AMD at once, with the notable exception of x86-64.


Integer division

C89 and C++98 say that the result of an integer division where the divisor and/or dividend is negative is implementation-defined. This reflects the fact that early hardware implemented integer division differently.

According to C89/C++98, we may have either (-3)/2 == -1 (round toward zero) or (-3)/2 == -2 (round toward negative infinity).

It appears round toward zero has become the overwhelming de facto standard now, adopted by both hardware and software vendors. Now both C and C++ explicitly require round toward zero in their new standards (C99 and C++2011*).

Division of negative integers has always been a complicated problem. Fortran mandated the same round-toward-zero mode much earlier than C/C++; so did Java. Python, on the other hand, has required round toward -∞ (i.e. (-3)//2 == -2) from its beginning. Everybody, nevertheless, agrees that a/b*b + a%b == a should always hold.
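The two conventions can be compared side by side in Python, which uses round toward -∞ natively. The helpers trunc_div and trunc_mod below are hypothetical names of mine that emulate the round-toward-zero behaviour of C99/C++11; this is a sketch for illustration, not a library API.

```python
def trunc_div(a, b):
    """Quotient rounded toward zero, as C99 and C++11 require."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def trunc_mod(a, b):
    """Remainder matching trunc_div; its sign follows the dividend."""
    return a - trunc_div(a, b) * b


def identity_holds(a, b):
    """Check a/b*b + a%b == a under both rounding conventions."""
    floor_ok = (a // b) * b + (a % b) == a
    trunc_ok = trunc_div(a, b) * b + trunc_mod(a, b) == a
    return floor_ok and trunc_ok
```

With these, trunc_div(-3, 2) gives -1 while (-3) // 2 gives -2, and the matching remainders are -1 and 1 respectively, so the identity holds in both cases.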

* C++0x has yet to be officially approved. Hopefully it will be approved within this year and known as C++2011. I’m using this name prematurely.


A hack to strace -f

I have a multithreaded program which I would like to strace for debugging purposes. My program sometimes calls (fork and exec) an external program, which in turn calls a setuid program.

Because my program is multithreaded, I cannot omit the “-f” flag (also trace child threads and processes) when using strace. And because all children, including the setuid program, are traced, setuid fails. (Yes, I am aware that strace claims it is possible to trace setuid programs, but the trick does not work for me, probably because the setuid program is not directly executed by strace.)

Fortunately, the clone system call has many useful flags. It works fine for me when I substitute calls to fork() with:

(pid_t) syscall (__NR_clone, CLONE_UNTRACED|SIGCHLD, NULL);

(Yes, SIGCHLD, not CLONE_SIGCHLD. It’s not a typo.)

I guess there may be better solutions, without modifying the program being traced?


Concatenate PDF files in Linux

Some people recommend using convert or gs. However, there is a major problem with them – all text and vector graphics become raster graphics.

pdftk (PDF ToolKit) is a better solution – it keeps text and vector graphics. We just have to use this command:

pdftk *.pdf cat output result.pdf

PDF Shuffler, written in Python and based on poppler, also does the trick, and has a nice GUI.

However, there are drawbacks for both pdftk and PDF Shuffler:

  • pdftk only supports ASCII filenames. So it’s a bit inconvenient for non-English users like me.
  • PDF Shuffler is way too slow. I tried concatenating several files (approx. 1000 pages), and it kept running for more than 10 minutes before I hit Ctrl-C; pdftk finished the same task in just a few seconds.


.note.GNU-stack

GCC always appends one line to any assembler file (.s) it generates:

	.section	.note.GNU-stack,"",@progbits

Literally, it adds an empty section named .note.GNU-stack to the object file, but it actually serves as a hint to the linker* that code in this object file does not require an executable stack. The GNU assembler also accepts the command-line option “--noexecstack”, which has the same effect.

If every object file contains a section of this name, the linker knows the whole program does not need an executable stack, and the resulting executable will run with a non-executable stack if the OS and underlying hardware support it (see also NX bit).

Why is this important? In practice, virtually no program needs an executable stack (hackers may sometimes use it, though), but buffer overflow attacks frequently insert and run code in stacks. A non-executable stack helps improve security without any overhead.

* GNU linker only.


Convert PDF to images in Linux

Just use convert, the universal image converter shipped with ImageMagick.

convert a.pdf a.png

And we get as many PNG files as there are pages in the PDF. The converted files are named a-0.png, a-1.png, …

We can also use it the other way around:

convert a.jpg b.png c.gif abc.pdf

This will combine the three images into one PDF file. Very flexible.
