Posts Tagged ‘Python’
A problem with pipes in Python 3
By chys on November 14th, 2011
The most disturbing change from Python 2 to 3 is definitely not the print() function, nor that some functions that used to return lists now return iterators, nor the removal of __cmp__; it is the transition to Unicode.
I’m completely supportive of the transition per se, but I’m disappointed that they’re trying to compel us to use Unicode by dropping useful functionality for byte streams/8-bit strings. For example, bytes has no format or % in Python 3.
I have some code like this:
proc = subprocess.Popen((....), stdout=subprocess.PIPE)
for line in proc.stdout:
    ...
I found that, on Linux, this code snippet is almost 10 times slower in Python 3 than in Python 2. Then I strace'd the code and found that Python 3 is passing a length of 1 to read, incurring thousands of times more system calls than Python 2. Are you kidding me? I was forced to use something like proc.stdout.read(...).
I understand this is not the direct result of the transition to Unicode, but it is somehow related.
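The proc.stdout.read(...) workaround mentioned above could look roughly like the sketch below. The helper name iter_lines_chunked, the 64 KiB chunk size and the stand-in child process are assumptions for illustration, not the code from the original script.

```python
import subprocess
import sys


def iter_lines_chunked(stream, chunk_size=65536):
    """Yield lines (bytes, newline included) from a binary stream.

    Data is requested in chunk_size pieces and split by hand, so the
    number of read calls depends on the chunk size, not on line length.
    Note: on a buffered pipe, read(n) waits for n bytes or EOF, so this
    suits batch output rather than interactive streaming.
    """
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # EOF
            break
        pending += chunk
        lines = pending.split(b"\n")
        pending = lines.pop()  # last piece may be an incomplete line
        for line in lines:
            yield line + b"\n"
    if pending:  # final line without a trailing newline
        yield pending


# Stand-in child process (hypothetical): prints three short lines.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('a'); print('b'); print('c')"],
    stdout=subprocess.PIPE,
)
lines = list(iter_lines_chunked(proc.stdout))
proc.stdout.close()
proc.wait()
```

The trade-off is that lines only become available once a whole chunk (or end of file) has arrived, which is fine when the child's output is consumed in bulk.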
Integer division
By chys on May 25th, 2011
C89 and C++98 say the result of an integer division where the divisor and/or dividend is negative is implementation-defined. This reflects that early hardware implemented integer division differently.
According to C89/C++98, we may have either (-3)/2 == -1 (round toward zero) or (-3)/2 == -2 (round toward negative infinity).
It appears round toward zero has become the overwhelming de facto standard now, adopted by both hardware and software vendors. Now both C and C++ explicitly require round toward zero in their new standards (C99 and C++2011*).
Division of negative integers has always been a complicated problem. Fortran mandated the same round-toward-zero mode much earlier than C/C++; so did Java. Python, on the other hand, has required round toward -∞ (i.e. (-3)//2 == -2) from its beginning. Everybody, nevertheless, agrees that a/b*b + a%b == a should always hold.
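To make the two conventions concrete, here is a small sketch in Python. The // and % operators give the round-toward-negative-infinity pair; trunc_divmod is a hypothetical helper written for this example that mimics the round-toward-zero behaviour of C99. Both pairs are checked against the identity a/b*b + a%b == a.

```python
def trunc_divmod(a, b):
    """Return (quotient, remainder) with the quotient rounded toward zero."""
    q = abs(a) // abs(b)       # magnitude of the quotient
    if (a < 0) != (b < 0):     # operands have opposite signs
        q = -q
    r = a - q * b              # remainder chosen so that q*b + r == a
    return q, r


for a, b in [(-3, 2), (3, -2), (-3, -2), (3, 2)]:
    floor_q, floor_r = a // b, a % b      # Python: round toward -infinity
    trunc_q, trunc_r = trunc_divmod(a, b) # C99-style: round toward zero
    # The identity holds under both conventions.
    assert floor_q * b + floor_r == a
    assert trunc_q * b + trunc_r == a
    print(a, b, (floor_q, floor_r), (trunc_q, trunc_r))
```

The two conventions differ only when the operands have opposite signs and the division is inexact; in that case the remainder takes the sign of the divisor under floor rounding and the sign of the dividend under truncation.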
* C++0x has yet to be officially approved. Hopefully it will be approved within this year and known as C++2011. I’m using this name prematurely.
A Python sort trick
By chys on October 23rd, 2009
>>> a = [12,14,133,130,176,25,54,79,127]
>>> b = sorted(range(len(a)),key=a.__getitem__)
>>> b
[0, 1, 5, 6, 7, 8, 3, 2, 4]
>>> map(a.__getitem__,b)
[12, 14, 25, 54, 79, 127, 130, 133, 176]
b is the so-called sort index: b[k] is the position in a of the k-th smallest element.
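The session above is Python 2, where map returns a list. A rough Python 3 equivalent is sketched below; the rank list (the inverse of the sort index) is an extra added for illustration and is not part of the original trick.

```python
a = [12, 14, 133, 130, 176, 25, 54, 79, 127]

# Sort the indices of a, using the element at each index as the key.
b = sorted(range(len(a)), key=a.__getitem__)

# Applying the sort index to a yields the sorted list.
sorted_a = [a[i] for i in b]

# Inverse permutation: rank[i] is where a[i] ends up in sorted order.
rank = [0] * len(a)
for pos, i in enumerate(b):
    rank[i] = pos
```

Because sorted is stable, equal elements keep their original relative order in b.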
Reference
http://www.newsmth.net/bbscon.php?bid=284&id=59819
Tags: Python
__cmp__ should be reintroduced to Python
By chys on August 11th, 2009
A lot of people think this decision to drop __cmp__ from Python sucks. (1 2)
The cmp argument is also dropped from list.sort and sorted. Some people explain that programmers often used cmp where key would have been more convenient and efficient. By dropping cmp, poor programmers are forced to use key. On the other hand, almost all comparators found in practice are actually based on some kind of key function.
Well, theoretically this probably is true: something.sort(key=locale.strxfrm) is better than something.sort(cmp=locale.strcoll).
BUT, what can I do if I only have a third-party comparator?
I have a Python script where I use vercmp, provided in the source of Gentoo Portage, to sort a list of Gentoo packages. No corresponding verkey is provided. So now, in Python 3, I have to either write a verkey myself, or use the following ugly and inefficient workaround:
class VerKey:
    def __init__(self, str):
        self.str = str
    def __lt__(self, other):
        return vercmp(self.str, other.str) < 0

versions.sort(key=VerKey)
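For what it's worth, Python 3.2 and later ship functools.cmp_to_key, which builds essentially this kind of wrapper class from any comparator. A minimal sketch follows; vercmp_stub is a hypothetical stand-in written for this example (it only understands dotted numeric versions) and is not Portage's real vercmp.

```python
from functools import cmp_to_key


def vercmp_stub(a, b):
    """Hypothetical stand-in for a third-party comparator.

    Returns a negative, zero or positive number, like the old cmp protocol.
    Only handles dotted numeric versions such as "1.9.1".
    """
    ka = [int(part) for part in a.split(".")]
    kb = [int(part) for part in b.split(".")]
    return (ka > kb) - (ka < kb)


versions = ["1.10", "1.2", "1.9.1", "0.30"]

# cmp_to_key wraps each item in an object whose comparisons call the comparator.
versions.sort(key=cmp_to_key(vercmp_stub))
```

The comparator is still called once per comparison, so this addresses the ugliness rather than the inefficiency.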
Tags: Python
