środa, 9 grudnia 2015

Obvious assumptions can result in bugs in software: Flask in embedded world

So the other day our team at Nokia was requested to implement web user interface for some embedded device. Usually I write in C++, but I was excited to take this task because we were able to put Python in place. Of course it's also possible to write web applications purely in C++ (e.g. with Cheerp - I encourage you to at least read about this project), but only thin back-end was required, so going with C++ didn't seem to be good choice. After cross-compiling Python to ARM (we anticipated that knowing how to do this would be common knowledge in the Internet, but again there were few resources) it was a time to pick some framework. We decided to use Flask - small cousin of Django, because

  • it's smaller than Django and we were already going to eat a lot of disk space with Python itself
  • it seemed like a perfect candidate for the task, because it looked so simple

I had my own goal as well :) I didn't know Flask, so there was something new to learn.

Fast forward to a place when I was about to deploy my application on the target device (after long struggle to have everything cross-compiled). Everything was set up, so I eagerly clicked the button in the front-end (react.js-based stuff) to prove that the back-end is working as well. As you can imagine - it didn't work.

http://thenextweb.com/wp-content/blogs.dir/1/files/2015/09/fn63697_01.jpg
Following is what I saw in logs. Some paths were masked, just-in-case. So after looking at this I thought to myself what the... It was working on my x86 machine!


In the rest of this post I'd like to describe a reason of this failure, which I found somewhat funny.

192.168.255.*** - - [01/Jan/2004 02:00:22] "POST /************************* HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1479, in full_dispatch_request
    response = self.process_response(response)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1693, in process_response
    self.save_session(ctx.session, response)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 837, in save_session
    return self.session_interface.save_session(self, session, response)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/sessions.py", line 326, in save_session
    val = self.get_signing_serializer(app).dumps(dict(session))
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 569, in dumps
    rv = self.make_signer(salt).sign(payload)
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 412, in sign
    timestamp = base64_encode(int_to_bytes(self.get_timestamp()))
  File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 220, in int_to_bytes
    assert num >= 0

Ok, so normally I'm reading tracebacks in a bottom-up fashion. After looking at the last item in the traceback I instantly felt that something really weird is happening. Name of the file wasn't making things less weird as well. Items above the last one are not telling much more, so I had to dive in into source code to look for other hints.

Function int_to_bytes, at the very beginning, asserts that the input number is positive. Quite normal thing to do, right? So I started a debugger to check what value it got instead. I hunched that it would be -1 or something like that. Dark clouds were already in my mind at the time. I was even afraid that the hardware itself does not support some feature. That's not rare in embedded world, after all.

With the debugger I was able to retrieve value passed in. Apparently it was -220917729. Immediately I realized that I'm facing some issues with the time, and Python interpreter itself is not to be blamed. So I jumped into get_timestamp to see how the timestamp is created.

def get_timestamp(self):
    """Returns the current timestamp.  This implementation returns the
    seconds since 1/1/2011.  The function must return an integer.
    """
    return int(time.time() - EPOCH)

I double-checked that time.time() gives positive number. So after factoring out call to time() was obvious that the EPOCH is making the result negative. Then I needed to assemble two facts: EPOCH in this function is set to 1/1/2011 and the target device starts with time set to 1/1/2004. This yielded negative timestamp and a spectacular failure in result.

So I find this one funny, because apparently the authors didn't expect that the system time could be set to 2004-something :) Perhaps they assumed that it's not possible to run with a date earlier than the day when the library was published :)

Nevertheless I was expecting errors to appear in my configuration. You don't deploy web applications on embedded devices that often. Frankly speaking I encountered other funny problems during the day as well. One of them was with nginx, and if the time permits I'll write post about it as well. Stay tuned ;]


Note: I intentionally skipped description of internal Flask organization. In short words Flask uses ItsDangerous package to have signed session storage at the client-side. You can read more about it here.

Video: Check if number is a palindrome in Python

So few months ago I posted this post about finding whether given number is a palindrome in Python using as little characters as possible. Out of the blue the team behind Webucator made great video basing on my blog post. To be honest I didn't anticipate such a thing and I'm very happy that it happened.

The video itself is excellent so I encourage you to watch it.

On their site you can find more videos. I believe it's also worth to mention that they offer pretty comprehensive Python classes.