Posts Tagged ‘Python’
The most disturbing change from Python 2 to 3 definitely is not the
print() function; nor that some functions which used to return lists now return iterators; nor the removal of
__cmp__; but the transition to Unicode.
I’m completely supportive of the transition per se, but I’m disappointed that they’re trying to compel us to use Unicode by dropping useful functionality for byte streams and 8-bit strings. For example, bytes has no % operator in Python 3.
I have some code like this:
proc = subprocess.Popen((....), stdout=subprocess.PIPE)
for line in proc.stdout:
    ...
I found that, on Linux, this code snippet is almost 10 times slower in Python 3 than in Python 2. Then I strace’d the code and found that Python 3 was passing a length of 1 to
read, incurring thousands of times more system calls than Python 2. Are you kidding me? I was forced to use something like
I understand this is not the direct result of the transition to Unicode, but it is somehow related.
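One way to avoid one-byte reads, not necessarily the workaround I ended up with, is to ask Popen for a buffered pipe through its bufsize argument (0 means unbuffered, -1 means the default buffer size). The following is only a sketch: the helper name read_lines and the child command, which just prints three lines through sys.executable, are made up for illustration.

```python
import subprocess
import sys

def read_lines(cmd, bufsize=-1):
    """Run cmd and return its stdout lines, reading through a buffered pipe."""
    # bufsize=-1 asks for the default buffer size; bufsize=0 would make the
    # pipe unbuffered, which can lead to many tiny read() calls.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=bufsize)
    lines = [line for line in proc.stdout]
    proc.stdout.close()
    proc.wait()
    return lines

# Hypothetical child process used only for this example.
child = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
lines = read_lines(child)
```

Running the script under strace with bufsize=0 and bufsize=-1 should show whether the number of read calls actually changes on a given Python version.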
C89 and C++98 say that the result of an integer division in which the divisor or the dividend is negative is implementation-defined. This reflects the fact that early hardware implemented integer division differently.
According to C89/C++98, we may have either
(-3)/2 == -1 (round toward zero) or
(-3)/2 == -2 (round toward negative infinity).
It appears round toward zero has become the overwhelming de facto standard now, adopted by both hardware and software vendors. Now both C and C++ explicitly require round toward zero in their new standards (C99 and C++2011*).
Division of negative integers has always been a complicated problem. Fortran mandated the same round-toward-zero mode much earlier than C/C++; so did Java. Python, on the other hand, has required round toward -∞ (i.e.
(-3)//2 == -2) from its beginning. Everybody, nevertheless, agrees that
a/b*b + a%b == a should always hold.
* C++0x has yet to be officially approved. Hopefully it will be approved within this year and known as C++2011. I’m using this name prematurely.
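The two rounding modes and the shared identity can be checked in Python. This is a small sketch: floor_divmod simply wraps Python’s own // and %, while trunc_divmod is a made-up helper that emulates the round-toward-zero behavior required by C99.

```python
def trunc_divmod(a, b):
    """Quotient rounded toward zero, with the matching remainder."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):   # operands have opposite signs
        q = -q
    return q, a - q * b

def floor_divmod(a, b):
    """Quotient rounded toward negative infinity (Python's // and %)."""
    return a // b, a % b

# In both modes, quotient * divisor + remainder gives back the dividend.
for a, b in [(-3, 2), (3, -2), (-3, -2), (3, 2)]:
    for q, r in (trunc_divmod(a, b), floor_divmod(a, b)):
        assert q * b + r == a

print(trunc_divmod(-3, 2))   # (-1, -1)
print(floor_divmod(-3, 2))   # (-2, 1)
```

The quotients differ only when the operands have opposite signs and the division is inexact; the remainder takes the sign of the dividend under truncation and the sign of the divisor under flooring.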
>>> a = [12,14,133,130,176,25,54,79,127]
>>> b = sorted(range(len(a)),key=a.__getitem__)
>>> b
[0, 1, 5, 6, 7, 8, 3, 2, 4]
>>> map(a.__getitem__,b)
[12, 14, 25, 54, 79, 127, 130, 133, 176]
b is the so-called sort index.
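In Python 3, map returns an iterator rather than a list, so the last step needs a list comprehension (or a list() call) to show the values. A sketch of the same idea for Python 3, where the helper name sort_index is made up here:

```python
def sort_index(seq):
    """Return the indices that would put seq in ascending order."""
    return sorted(range(len(seq)), key=seq.__getitem__)

a = [12, 14, 133, 130, 176, 25, 54, 79, 127]
b = sort_index(a)
a_sorted = [a[i] for i in b]   # apply the sort index to the original list

print(b)         # [0, 1, 5, 6, 7, 8, 3, 2, 4]
print(a_sorted)  # [12, 14, 25, 54, 79, 127, 130, 133, 176]
```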
The cmp argument has also been dropped from sorted in Python 3. Some people explain that programmers used to reach for cmp even where key would be more convenient and efficient, so dropping cmp forces poor programmers to use key. Besides, almost all comparators found in practice are actually based on some kind of key function.
Well, theoretically this probably is true:
something.sort(key=locale.strxfrm) is better than sorting with a comparator such as locale.strcoll.
BUT, what can I do if I only have a third-party comparator?
I have a Python script where I use
vercmp, provided in the source of Gentoo Portage, to sort a list of Gentoo packages. No corresponding
verkey is provided. So now, in Python 3, I have to either write a
verkey myself, or use the following ugly and inefficient workaround:
class VerKey:
    def __init__(self, str):
        self.str = str
    def __lt__(self, other):
        return vercmp(self.str, other.str) < 0

versions.sort(key=VerKey)
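On Python 2.7 and 3.2 or later, the standard library offers functools.cmp_to_key, which builds this kind of wrapper class from any old-style comparator. A minimal sketch follows. Since Portage’s vercmp is not available here, toy_vercmp is a hypothetical stand-in that only understands dotted numeric versions, and the version strings are made up.

```python
from functools import cmp_to_key

def toy_vercmp(x, y):
    """Hypothetical stand-in for a third-party comparator.

    Returns a negative, zero, or positive number, like an old-style cmp.
    """
    xs = [int(part) for part in x.split(".")]
    ys = [int(part) for part in y.split(".")]
    return (xs > ys) - (xs < ys)

versions = ["1.10", "1.2", "1.9.1", "0.9"]
versions.sort(key=cmp_to_key(toy_vercmp))
print(versions)   # ['0.9', '1.2', '1.9.1', '1.10']
```

This still calls the comparator once per comparison, so it is no faster than the hand-written class above; it only saves writing the boilerplate.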