środa, 29 lipca 2015

Python: palindrome number test challenge

Recently, in one of the check.io missions i was playing with a task to find first palindromic prime number (palprime) greater than given argument. The challenge was to make code as small as possible. To check whether a number is a palprime we need to check whether it is both a prime and a palindromic number. In this post I'd like to focus on checking whether a number is palindromic.

Spoiler alert: think twice before reading this post if you're going to play the game.

(source: http://40.media.tumblr.com/42ae198800aa76ad3c06878fd93a2d82/tumblr_nf8x7keJkF1rkt29io1_500.png)

I believe most Python devs out there would quickly come with intuitive solution, presented below.
The approach here is pretty straightforward - instead of doing arithmetics (like here) we can simply convert the number to a string and then compare it with reversed version of it. The solution consists of 24 characters. Can we make it smaller?

(source: http://gemssty.com/wp-content/uploads/2013/03/cardboard_phone_2.jpg)

It will be obvious to seasoned Python programmers to use fabulous Python slices, as shown below.
With this simple transformation we saved 4 characters - that's nice. How does it work? Python slices allow specifying start, stop, and step. If we omit start and stop and provide -1 as a step slicing will work in magical way: it will go through the sequence in right-to-left direction. As a side note I can add that slices in Python work in a little weird way - e.g. "abcd"[0:len("abcd"):-1] gives "" in result while "abcd"[len("abcd"):0:-1] renders "dcb". [::-1] specifying start and stop parameters would require us to append firs character to the result: "abcd"[len("abcd"):0:-1]+"a".

So, going back to the topic, are we stuck with this solution? If we can invert the logic (that won't be always the case), then there's other way that I discovered during trials. It required me to adapt rest of the code, but no single character was introduced. Instead, I saved one more.
In essence, the idea is that if the number is palindromic then subtracting reversed version of it from the original version should result in 0. Here we are converting to string just to reverse the number and then there's back-to-integer conversion.

So I ended up with 19 characters thinking about what else I can do. I literally ran out of ideas, so I though of googling for operators available in Python language. During my life I managed to ensure myself that even if you think you have mastered your language of choice, you are wrong (C++ helped me to shape this attitude with stuff like template unions, secret shared_ptr c-tors etc. ;)).

In the haystack of results I found a needle. It turns out that in Python 2.x there's a possibility to convert a number (even if it's held in a variable) to a string using backticks (a.k.a backquotes, reverse quotes). Well... what can I say. I never expected existence of such thing, especially in Python.

Fortunately it does the job, so was able to use this feature and cut down length of the test to 14. Awesome!
Above solution is the shortest I was able to come with. I wouldn't be surprised if there were shorter ones, though. If you know how to make it shorter, please let me know in comments.

Besides making this check so short I was curious what are differences between conversions and reversions in terms of performance. I decided to give it a go on my machine (i5-3320M 2.6GHz).

python -m timeit 'n=2**30; str(n)'
python -m timeit 'n=2**30; repr(n)'
python -m timeit 'n=2**30; `n`'

python -m timeit 's=qwertyuiopasfghjklzxcvbnm; rs=''.join(reversed(s))'
python -m timeit 's=qwertyuiopasfghjklzxcvbnm; rs=s[::-1]'
My conclusions are:
  • there's very slight (almost unnoticeable) difference between three methods of converting integer to string
  • for some reason Python3.4 is slower than Python2.7
  • Pypy cruelly outperforms Python, as usual. If you care about performance you should use pypy.
  • there's almost no difference between reverse(n) and [::-1] when it comes to speed
The fact that there are no performance differences between reverse() and [::-1] surprised me a bit. By looking into byte code I can only assume that in my particular case different set of operations yielded the same performance results.

reversed                             [::-1]
 0 LOAD_CONST    0 ('qwer...')        
0 LOAD_CONST    0 ('qwer...')
 3 STORE_NAME    0 (s)                3 STORE_NAME    0 (s)
 6 LOAD_CONST    1 ('')               6 LOAD_NAME     0 (s)
 9 LOAD_ATTR     1 (join)             9 LOAD_CONST    1 (None)
12 LOAD_NAME     2 (reversed)        12 LOAD_CONST    1 (None)
15 LOAD_NAME     0 (s)               15 LOAD_CONST    3 (-1)
18 CALL_FUNCTION 1 (1 pos, 0 kw)     18 BUILD_SLICE   3
21 CALL_FUNCTION 1 (1 pos, 0 kw)     21 BINARY_SUBSCR
24 STORE_NAME    3 (rs)              22 STORE_NAME    1 (rs)
27 LOAD_CONST    2 (None)            25 LOAD_CONST    1 (None)
30 RETURN_VALUE                      28 RETURN_VALUE

In case of conversion from integer to a string I anticipated that str and repr will yield similar results. What I didn't know was how backticks will perform. By looking into byte code I can say that the only difference is that backticks don't call any function, but use UNARY_CONVERT op directly.

str                                  ``
 0 LOAD_CONST    0 (1234567890)       
0 LOAD_CONST            0 (1234567890)
 3 STORE_NAME    0 (n)                3 STORE_NAME               0 (n)
 6 LOAD_NAME     1 (str)              6 LOAD_NAME                0 (n)
 9 LOAD_NAME     0 (n)                9 UNARY_CONVERT
12 CALL_FUNCTION 1                   10 STORE_NAME               1 (s)
15 STORE_NAME    2 (s)               13 LOAD_CONST               1 (None)
18 LOAD_CONST    1 (None)            16 RETURN_VALUE

2 komentarze:

  1. back tick is the equivalent of `repr()`, so you can't use this on extremely large integers. otherwise there will be a 'L' at the tail of the number.

    1. it's not exact equivalent, but yes, for numbers bigger than sys.maxint it yields extra 'L' suffix