Programming soup<br />
Sławomir Zborowski<br />
<h3>
Blog moved to slawomir.net</h3>
Published: 2019-08-29<br />
Blog has been moved to <a href="http://slawomir.net/">slawomir.net</a>.<br />
<h3>
HexIT Escape Room for IT geeks - escape if you can (Wrocław, Poland)</h3>
Published: 2018-10-03<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL1vciSzEW7F1alqbNSueCFlfmwkmSIPzhXttkzHBLlnPd7mDiMy6Zor-DwFhPFVdENyR7ysmpuJqIe1q35Y92FVIGM_fLCs3QIoUvfspiNMJVIoR7Wz3UhXvnK64Nx2RdyCpnK_bZI1E/s1600/hexit1.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="517" data-original-width="775" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL1vciSzEW7F1alqbNSueCFlfmwkmSIPzhXttkzHBLlnPd7mDiMy6Zor-DwFhPFVdENyR7ysmpuJqIe1q35Y92FVIGM_fLCs3QIoUvfspiNMJVIoR7Wz3UhXvnK64Nx2RdyCpnK_bZI1E/s320/hexit1.jpg" width="320" /></a>I'm glad to announce that we have launched an escape room aimed at IT people (developers, testers etc.). It has been running smoothly for a month now, and about 30 teams (3-4 people each) have already enjoyed it.<br />
<br />
Treat yourself and pay us a visit! Based on the reactions of other teams, I can guarantee a remarkable experience. You don't need to be a "hackerman" to complete the room, but if you are, you will do so faster ;-). Teams can be mixed too (but at least one person with basic programming skills is needed).<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-K8hS0ggFWcAWrxzR9jmsUuaZUNZlkZwtV1KIeMq0HND3msyPMaF65FUnAVZWS-CRUw7PJdCM8lUsAF0Zip_9YKO4sMfXXKGT2tcAaerSHXeLXnKEUkD0lcOIU12uOclofaPXbeHT-14/s1600/hexit2.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="1440" data-original-width="960" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-K8hS0ggFWcAWrxzR9jmsUuaZUNZlkZwtV1KIeMq0HND3msyPMaF65FUnAVZWS-CRUw7PJdCM8lUsAF0Zip_9YKO4sMfXXKGT2tcAaerSHXeLXnKEUkD0lcOIU12uOclofaPXbeHT-14/s320/hexit2.jpg" width="213" /></a></div>
<br />
<b>Room location & partnership with Let Me Out.</b><br />
<br />
ul. Bernardyńska 4 (close to Galeria Dominikańska), 2nd floor (map below)<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><img border="0" data-original-height="658" data-original-width="981" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLKlLnQxOnkyfLSbPuiOn5fjnVvsiHMl6xHsLx43jpxEySR42fTZ57FbCLefg13qv2wS7zlr4BPuGlHc2zXkzUK_nx5ER6_VHqZPqI7FOK2muUUSsBrU9qFQG-GdpXgnye2daGzXvhaKY/s320/hexit-map.png" style="margin-left: auto; margin-right: auto;" width="320" /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="https://www.google.com/maps/place/Bernardy%C5%84ska+4,+11-400+Wroc%C5%82aw,+Polska/@51.1101787,17.0294945,15.5z/data=!4m5!3m4!1s0x470fc277e01358ed:0x494fb3fbd9f272a!8m2!3d51.1104811!4d17.0413637" target="_blank">Link to Google Maps</a></td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<br />
<span style="font-size: large;"><span style="background-color: yellow;">Book here: <a href="http://letmeout.pl/">letmeout.pl</a><span style="background-color: white;"> (select Wrocław)</span></span></span><br />
<br />
I'm the author and creator of the room, but I'm not running the business. The company that operates the room is Let Me Out, which has an excellent portfolio of other escape rooms in a lot of Polish cities and in Brussels.<br />
<br />
<b>Room theme</b><br />
It goes like this: <br />
<i>Another country is trying to become an atomic superpower through the
development of nuclear weapons, which consequently results in the
destabilization of the region and the escalation of the international
conflict on an unprecedented scale. The world is on the verge of the
outbreak of World War III. The only salvation is to infect the secret
uranium processing plant with a computer virus. Will a group of
programmers be able to prevent nuclear war in 90 minutes?</i> <br />
<br />
I received some suggestions that the room should be marketed as "an ordinary escape room with extra IT riddles". And this is actually what I wanted to build: not to hand the players a desk, a PC and Jira, but to give them a nice mix of a good background story and many different IT riddles.<br />
Solve the riddles while saving the world! :-)<br />
<br />
<b>Easter Egg</b><br />
There are some easter eggs in the room. One of them will let you listen to a famous song. The code is what comes out of `1900 + 80 + 9`, and you need to enter it properly. You'll know where once you're there ;)<br />
<br />
<b>Room name</b><br />
Fun fact about the name: it incorporates four things:<br />
<ul>
<li>Hex, as a reference to hexadecimal numbers (you'll see some of them ;-)</li>
<li>Hex, as a uranium industry jargon name for <a href="https://en.wikipedia.org/wiki/Uranium_hexafluoride" target="_blank">Uranium Hexafluoride</a></li>
<li>Exit, related to the word Escape</li>
<li>IT - information technology</li>
</ul>
<h3>
Global app variables in connexion & aiohttp</h3>
Published: 2018-08-17<br />
tl;dr: use <span style="font-family: "Courier New", Courier, monospace;">pass_context_arg_name</span> and <span style="font-family: "Courier New", Courier, monospace;">api.subapp</span><br />
<br />
Nowadays microservice architecture seems to be the default way distributed applications are built. People have also started to treat APIs as first-class citizens. Hence, it's no surprise that projects like <a href="https://github.com/OAI/OpenAPI-Specification" target="_blank">Swagger/OpenAPI</a> are gaining popularity on a daily basis.<br />
<br />
One of the Python OpenAPI implementations that I discovered recently is <a href="https://github.com/zalando/connexion/" target="_blank">Connexion</a>. The advantages of using OpenAPI are obvious: e.g. you can decouple the endpoint schema from the app logic and have a single place where the whole API is described. Even the fact that there's a Swagger UI for API users can be quite beneficial.<br />
<br />
In the past I looked at different frameworks like django-rest, but nothing seemed as simple as Connexion. I decided to play with it right after discovering that the guys from Zalando had added support for aiohttp (an asynchronous HTTP server), the framework we use extensively in our projects.<br />
<br />
So what's the problem? What is this post about? Although Connexion is great, it is undocumented (or my DuckDuckGo-fu sucks and it is in fact just poorly documented) how to glue it together with the way global variables are handled in aiohttp: using the app as a container for globals. Consider the following snippet:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: #cc0000;">async</span><b> </b><span style="color: #cc0000;">def</span> handler(request):</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> <span style="color: blue;"># this is how aiohttp creators recommend to access global variables</span></span><br />
<span style="font-family: "Courier New", Courier, monospace;"> <span style="color: blue;"># e.g. database handle</span></span><br />
<span style="font-family: "Courier New", Courier, monospace;"> request.app[<span style="color: magenta;">'redis_con'</span>].incr(<span style="color: magenta;">'visits'</span>)</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> <span style="color: #cc0000;">return</span> web.Response(body=<span style="color: magenta;">b'hello'</span>)</span><br />
<br />
Nothing more than an ordinary aiohttp handler that uses the <span style="font-family: "Courier New", Courier, monospace;">redis_con</span> global. Unfortunately, using globals with Connexion is not that straightforward. Here is an example of what Connexion handlers look like (it comes from the Connexion docs):<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: #cc0000;">def</span> example(name: str) -> str:</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> <span style="color: #cc0000;">return</span> <span style="color: magenta;">'Hello {name}'</span>.format(name=name) </span><br />
<br />
There's no request parameter! It took me some time to find out how to make Connexion pass the request (the aiohttp context) to handlers. I had to dig into the source code to figure out the following:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: #cc0000;">def</span> start(redis_con):</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> app = connexion.AioHttpApp(__name__, specification_dir=<span style="color: magenta;">'swagger/'</span>)<br /> api = app.add_api(<span style="color: magenta;">'api.yml'</span>, pass_context_arg_name=<span style="color: magenta;">'request'</span>)<br /> api.subapp[<span style="color: magenta;">'redis_con'</span>] = redis_con<br /> app.run()</span><br />
<br />
We're passing the <span style="font-family: "Courier New", Courier, monospace;">pass_context_arg_name</span> parameter, and it turns out that for aiohttp the context is the request. The unintuitive thing is the subapp part. Connexion appears to serve each API from its own aiohttp sub-application, and that sub-application is what <span style="font-family: "Courier New", Courier, monospace;">request.app</span> refers to inside a handler, so the global has to be set there. I found this part in the <span style="font-family: "Courier New", Courier, monospace;">aiohttp_jinja2.setup</span> function. Now we can use it in handlers as follows.<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: #cc0000;">async def</span> handler(*args, **kwargs):</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> kwargs[<span style="color: magenta;">'request'</span>].app[<span style="color: magenta;">'redis_con'</span>].incr(<span style="color: magenta;">'visits'</span>)</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> <span style="color: #cc0000;">return</span> web.Response(body=<span style="color: magenta;">b'hello'</span>) </span><br />
<br />
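The handler above digs the request out of kwargs. Assuming Connexion passes the context as a keyword argument named after <span style="font-family: "Courier New", Courier, monospace;">pass_context_arg_name</span> (which is what the kwargs lookup suggests), the parameter can also be named explicitly. The sketch below is not Connexion code: <span style="font-family: "Courier New", Courier, monospace;">FakeRedis</span> and <span style="font-family: "Courier New", Courier, monospace;">make_fake_request</span> are hypothetical stand-ins so that the idea can be run without aiohttp or Redis.<br />

```python
import asyncio
from types import SimpleNamespace


class FakeRedis:
    """Hypothetical stand-in for a Redis connection; it only counts."""

    def __init__(self):
        self.counters = {}

    def incr(self, key):
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key]


async def handler(request, **kwargs):
    # `request` is received by keyword, so it can be a named parameter
    # instead of being looked up in kwargs.
    request.app['redis_con'].incr('visits')
    return 'hello'


def make_fake_request(redis_con):
    # Stand-in for an aiohttp request: only the `.app` mapping is modelled.
    return SimpleNamespace(app={'redis_con': redis_con})


redis_con = FakeRedis()
# Mimic the assumed call style: the context is passed by keyword.
body = asyncio.run(handler(request=make_fake_request(redis_con)))
```

In a real handler the return value would be a <span style="font-family: "Courier New", Courier, monospace;">web.Response</span>, as in the snippet above; a plain string is used here only to keep the sketch free of dependencies.<br />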
That's all. It seems like an easy thing, but I couldn't find it anywhere online.<br />
<h3>
Handling multiple identical USB ethernet adapters (Raspberry PI, udev)</h3>
Published: 2018-07-21<br />
You have to build a simple Ethernet-connected chain of devices and continuously check that it's healthy. In order to save money and time, you decide to replace individual devices (say, Raspberries) with multiple USB Ethernet adapters. You buy Chinese ones. What could go wrong?<br />
<br />
We're building an <a href="https://en.wikipedia.org/wiki/Escape_room" target="_blank">escape room</a>. There are plenty of them in Wrocław, but ours is special because it's dedicated to IT people. Random people would have a lot of trouble solving even the first riddles. These riddles are supposed to be great fun for tech people.<br />
<br />
I don't want to spoil the riddles, so let's stick to the technical problem I had at hand. Multiple devices need to be accessible in a specific configuration to solve one of the riddles. It made no sense to have a separate device for each if their only purpose was to respond to some ICMP packets (there is certainly an even lower-level solution, but we needed something easy and reliable right away). We decided to limit the number of devices and attach several USB Ethernet adapters to each. My colleague bought some Chinese adapters like the one in the picture below, and problems emerged immediately.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf2AYJaIXtP6hthZioAeA7_yhnAMlgrCK4Y5pICqWQljMZ5Vt_ZoQMHjHoaVDgMSrtcRqGqBiwWHVwLoLO9AJf6FcQfraZMr9iEABBna4xo8GRP7uo4AIEg_tTKmaOJIKCCH3-TkgzBnw/s1600/20180717_223254.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf2AYJaIXtP6hthZioAeA7_yhnAMlgrCK4Y5pICqWQljMZ5Vt_ZoQMHjHoaVDgMSrtcRqGqBiwWHVwLoLO9AJf6FcQfraZMr9iEABBna4xo8GRP7uo4AIEg_tTKmaOJIKCCH3-TkgzBnw/s320/20180717_223254.jpg" width="240" /></a></div>
<br />
BTW, a fun fact: the CE marks on some devices (I'm not sure about this one) may not actually be CE marks but "China Export" marks. You can read more about it <a href="https://www.ybw.com/vhf-marine-radio-guide/warning-dont-get-confused-between-the-ce-mark-and-the-china-export-mark-4607" target="_blank">here</a>.<br />
<br />
<b>Perfect hardware clones!</b><br />
<br />
So what's the problem? Well... when I first plugged in the first adapter, I made some configuration changes in Raspbian and was happy that everything worked flawlessly. However, a couple of days later I connected a second adapter to the same device, and that was when the problem surfaced. All of these USB adapters had the same MAC address. To make it even worse, after inspecting what's in /sys, I was sure that all of the USB parameters were identical too. In other words, these devices were perfect clones. The ROM was the same for all of them! And btw one out of 8 was not working at all.<br />
<br />
Why is this a problem? Because if the names are the same, the kernel will rename the network interface to something like rename{number}, and there's no reliable way to tell which interface is connected to which cable. Sadly, they also share the same MAC, so if you connect all adapters to the same switch, funny things will start to happen!<br />
<br />
<b>U<strike>boot</strike>dev to the rescue</b><br />
<br />
I'm not that into Linux, but I immediately knew where to look: udev. I was afraid that there wouldn't be a way to differentiate between the adapters at the udev level, and I was right.<br />
<br />
However, a silly solution is possible (or maybe not so silly: if something is silly but it works, it's not silly ;-): differentiate the USB ports rather than the devices themselves.<br />
<br />
I started reading the documentation and found that you can create rules based on ports, like the following:<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">SUBSYSTEM=="net", KERNELS=="1-2:1.0", ATTR{address}=="00:e0:4c:53:44:58"</span></span><br />
<br />
net is the subsystem we want. The USB port must be provided in the KERNELS parameter (the S at the end is both intentional and crucial: KERNELS also searches the parent devices, whereas KERNEL would only match the network interface itself). By providing the address attribute you can further restrict the rule to the Chinese adapters you have on the desk.<br />
<br />
Finding out the USB port names proved to be a slightly tricky task. You can do it using the udevadm utility.<br />
I have prepared a diagram for my RPi 3:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-xfJ-j6XfaB08srxul-dMTyQzBk4XWXEKt8-LETileN2J3eWph6oVw5KoalMgcjT1X7m-Ebz1zx-gmbhR8vMQvMypj4qM5HNIcQu3hLZkorc7X_2FJku4eXATfbxyS0-JdNZBqouu74Y/s1600/rpi-usb-ports.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="186" data-original-width="566" height="131" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-xfJ-j6XfaB08srxul-dMTyQzBk4XWXEKt8-LETileN2J3eWph6oVw5KoalMgcjT1X7m-Ebz1zx-gmbhR8vMQvMypj4qM5HNIcQu3hLZkorc7X_2FJku4eXATfbxyS0-JdNZBqouu74Y/s400/rpi-usb-ports.png" width="400" /></a></div>
<br />
<br />
Please note that this may be different in your case, because it all depends on:<br />
<ul>
<li>hardware revision</li>
<li>firmware versions</li>
<li>kernel version</li>
<li>kernel modules version </li>
</ul>
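As an alternative to reading udevadm output by hand, the port names can be read from sysfs. The sketch below assumes that each network interface's <span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">/sys/class/net/&lt;ifname&gt;/device</span></span> link points at the USB interface directory, whose name (for example 1-1.2:1.0) is the value used in KERNELS. That is how USB network adapters usually show up, but treat it as an assumption and verify it on your board. The function names are made up for this example.<br />

```python
import os


def usb_port_of(ifname, sysfs_root="/sys"):
    """Return the kernel name of the device behind a network interface.

    For a USB adapter this is typically something like "1-1.2:1.0".
    Returns None when the interface has no backing device (e.g. "lo").
    """
    device_link = os.path.join(sysfs_root, "class", "net", ifname, "device")
    if not os.path.exists(device_link):
        return None
    # Resolve the symlink and keep only the last path component.
    return os.path.basename(os.path.realpath(device_link))


def list_ports(sysfs_root="/sys"):
    """Map every network interface name to its backing device name."""
    net_dir = os.path.join(sysfs_root, "class", "net")
    if not os.path.isdir(net_dir):
        return {}
    return {
        name: usb_port_of(name, sysfs_root)
        for name in sorted(os.listdir(net_dir))
    }


if __name__ == "__main__":
    for ifname, port in list_ports().items():
        print(ifname, port or "-")
```

Plugging one adapter at a time into each port and running the script should be enough to draw a port map like the one above. Interfaces that are not USB devices will report a different kind of name, or none at all.<br />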
Once we know these USB "addresses", we can write the rules, which are shown below. I'd like to additionally emphasize two things:<br />
<ul>
<li>you can match on the MAC using <span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">ATTR{address}=="mac-here"</span></span>, but apparently there's no way to change it from a rule (<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">ATTR{address}="new-mac"</span></span> doesn't work)</li>
<li>changing the MAC address is still possible by running a command (e.g. <span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">ifconfig &lt;ifname&gt; hw ether ...</span></span>), and you can even use the name you set in the same rule, but you must use an absolute path to the executable!</li>
</ul>
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">SUBSYSTEM=="net", KERNELS=="1-1.2:1.0", ATTR{address}=="00:e0:4c:53:44:58", NAME="kabelek1", RUN+="/sbin/ifconfig kabelek1 hw ether 00:e0:4c:00:00:01"<br />SUBSYSTEM=="net", KERNELS=="1-1.4:1.0", ATTR{address}=="00:e0:4c:53:44:58", NAME="kabelek2", RUN+="/sbin/ifconfig kabelek2 hw ether 00:e0:4c:00:00:02"<br />SUBSYSTEM=="net", KERNELS=="1-1.3:1.0", ATTR{address}=="00:e0:4c:53:44:58", NAME="kabelek3", RUN+="/sbin/ifconfig kabelek3 hw ether 00:e0:4c:00:00:03"<br />SUBSYSTEM=="net", KERNELS=="1-1.5:1.0", ATTR{address}=="00:e0:4c:53:44:58", NAME="kabelek4", RUN+="/sbin/ifconfig kabelek4 hw ether 00:e0:4c:00:00:04"</span></span><br />
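Since the four rules differ only in the port, the interface name and the last byte of the new MAC, they can also be generated. The following is a small sketch that reproduces the rules above from a list of ports; the helper name and its parameters are made up for this example, and the numbering scheme (kabelek1, kabelek2, ... with the index as the last MAC byte) simply mirrors what I wrote by hand.<br />

```python
RULE_TEMPLATE = (
    'SUBSYSTEM=="net", KERNELS=="{port}", ATTR{{address}}=="{shared_mac}", '
    'NAME="{name}", RUN+="/sbin/ifconfig {name} hw ether {new_mac}"'
)


def build_rules(ports, shared_mac="00:e0:4c:53:44:58",
                name_prefix="kabelek", mac_prefix="00:e0:4c:00:00"):
    """Build one udev rule per USB port, in the order the ports are given."""
    rules = []
    for index, port in enumerate(ports, start=1):
        if index > 0xFF:
            raise ValueError("only the last MAC byte is varied: 255 ports max")
        name = f"{name_prefix}{index}"
        new_mac = f"{mac_prefix}:{index:02x}"
        rules.append(RULE_TEMPLATE.format(
            port=port, shared_mac=shared_mac, name=name, new_mac=new_mac))
    return rules


# Port order taken from the rules above.
ports = ["1-1.2:1.0", "1-1.4:1.0", "1-1.3:1.0", "1-1.5:1.0"]
rules = build_rules(ports)
print("\n".join(rules))
```

The output goes into a rules file under /etc/udev/rules.d/ (the file name is up to you).<br />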
<br />
And voilà! You are free to connect a lot of adapters to a single Raspberry. You still need to keep track of which Ethernet cable goes with which USB port, and you will also need to do something with the cables ;)<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFogbTVxQ5BlCWiURnTaT03UrxdWtqLuEjhnxWAtYibjQvnBYtF2RhhCumF1Ufa-Df4Wm7_7et92tERgfoi9l0u1PP-dPK-Ye2V_1-94mfzpcOgw110JnLgdz74TJm6gWJopgeTDgJ6iU/s1600/20180717_223305.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFogbTVxQ5BlCWiURnTaT03UrxdWtqLuEjhnxWAtYibjQvnBYtF2RhhCumF1Ufa-Df4Wm7_7et92tERgfoi9l0u1PP-dPK-Ye2V_1-94mfzpcOgw110JnLgdz74TJm6gWJopgeTDgJ6iU/s320/20180717_223305.jpg" width="320" /></a></div>
<br />
This is what my desk looked like when I was figuring things out.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEin8meX6_KfKYi3IL4fr8Zr83XrDw3UgVrH2-_BGPJcrqVMFDQh7O8alD2W0VXPRrXE-Gxd0dHms02QGqQf_Y-EkZdt2tqu33fIvlKNRaNVu4SropFZqyHWzTeRidoUxLbm0dgtVjmdK1E/s1600/20180621_002046.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEin8meX6_KfKYi3IL4fr8Zr83XrDw3UgVrH2-_BGPJcrqVMFDQh7O8alD2W0VXPRrXE-Gxd0dHms02QGqQf_Y-EkZdt2tqu33fIvlKNRaNVu4SropFZqyHWzTeRidoUxLbm0dgtVjmdK1E/s400/20180621_002046.jpg" width="300" /></a></div>
To summarize, almost everything can be done, and if something really can't, you can usually work around it somehow. However, I believe this trick is just a stopgap. Chinese adapters can backfire at any time, so if you require reliability, you should look for other hardware.<br />
<h3>
Preconfigured Jenkins cluster in Docker Swarm (proxy, accounts, plugins)</h3>
Published: 2018-01-02<br />
In recent years a lot of popular technologies have been adjusted so that they can run in Docker containers. Our industry has even coined a new word: dockerization. When something is dockerized, we usually expect it to behave like a self-contained app that is controlled with either command-line switches or environment variables. We also assume that, apart from this kind of customization, the dockerized thing is zero-conf: it will start right away with no further magic spells.<br />
<br />
It's just awesome when things work that way. Unfortunately, there are exceptions, and Jenkins is one of them. The problem with Jenkins is that even when you start it in a container, you still need to:<br />
<ul>
<li>open the configuration wizard (it's a web page) </li>
<li>prove that you're the guy: pass its challenge by reading some magic file and pasting its content into the configuration wizard</li>
<li>configure a proxy, if you're behind one</li>
<li>select the plugins to be installed during initialization</li>
<li>set up an admin account </li>
</ul>
Pretty bad. It resembles an installation wizard like in Windows. Phew. A couple of weeks ago I was trying to check how well Jenkins would solve one of our data transformation (ETL) problems, and I was unsure how many times it would be deployed. Hence I needed to do something about this installation process so that it sucks less. All of the building blocks were already on the table: Terraform, Ansible and Docker Swarm. The missing part was a pre-configured dockerized Jenkins running in the Swarm.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5rhdX4aWg2UoGk0CzpW1IOp9tHVgu-QxLm-8g_dqFspVmxZ3Dvrxwyg1ryKOEw-L-DraLDtDWB8GQio2YCmr0ozPL5YlAGXaMvhMSfri7qH_COq6TtQOfIsPU-KegVULAxX4sKLH581Y/s1600/satanspbeach.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="1181" data-original-width="767" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5rhdX4aWg2UoGk0CzpW1IOp9tHVgu-QxLm-8g_dqFspVmxZ3Dvrxwyg1ryKOEw-L-DraLDtDWB8GQio2YCmr0ozPL5YlAGXaMvhMSfri7qH_COq6TtQOfIsPU-KegVULAxX4sKLH581Y/s320/satanspbeach.jpg" width="207" /></a>So this post, in DuckDuckGo-friendly list, explains how to:<br />
<ul>
<li>pre-configure Jenkins with a custom user (admin) account</li>
<li>pre-configure Jenkins with a proxy</li>
<li>pre-configure Jenkins with specified plugins</li>
<li>run the Jenkins master and slaves entirely in Docker Swarm, with Jenkins' own Swarm plugin establishing the master-slave connections automatically</li>
<li>allow Jenkins jobs to execute other Docker containers nearby (the daemon socket trick)</li>
</ul>
<br />
<br />
<br />
<br />
<br />
<span style="font-size: xx-small;"><span style="color: #999999;"> http://www.rustypants.net/wp-content/uploads/2008/10/satanspbeach.jpg</span></span><br />
<br />
<br />
<h4>
Abandon all hope, ye who enter here.</h4>
I remember that in one of the C projects (<strike>not sure what was it, but perhaps something from GNU, maybe RMS</strike> update: <a href="https://gist.github.com/danmilon/4719562" target="_blank">it was xterm</a>) there was this comment: "abandon all hope, ye who enter here". <strike>It also mentioned how many people have ignored this warning and tried to refactor something.</strike> I have the same feelings about configuring Jenkins without custom Groovy scripts. I was reluctant to learn a new language, but eventually this seemed like the most reasonable way to continue.<br />
<br />
Of course, all of the following problems can be solved in a troglodyte way too. E.g. you can configure Jenkins by hand, extract the Jenkins home directory, tar-gzip it and reuse it. But that brings a couple of other problems. Also, surprisingly, a fresh Jenkins home weighed about 70 MB in my case. I always thought that it was just a bunch of XML files, but perhaps it's not that straightforward. Since the primitive solutions didn't work right away, I decided to stop for a while and try to solve the problem "the right way".<br />
<br />
<h4>
System overview & requirements.</h4>
The system is simple: there's one master (and it's a brilliant example of a SPOF, but nobody cares, since you're unsure about its future) and a number of workers (slaves). We want workers to register with the master automatically. Unfortunately, this is not possible using the plain JNLP solution, because you need to register the worker in the master prior to establishing a link. In theory you could do some <span style="font-family: "courier new" , "courier" , monospace;">curl</span> magic, but fortunately there's a plugin that does it for you: Jenkins Swarm (not to be confused with Docker Swarm, as it has literally nothing to do with it). The Jenkins Swarm plugin consists of two things: a plugin for the Jenkins master and a Java JAR for the slaves.<br />
So we're set. Jenkins Swarm will take care of auto-connecting the slaves. Now we must run a dockerized version of these slaves and put it into Docker Swarm. But before we talk about slaves, let's handle the master.<br />
<br />
<h4>
Jenkins master with plugins, proxy, and extra configuration.</h4>
Let me paste the Dockerfile and explain it line by line.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>FROM</b></span> jenkins/jenkins:2.89.1-alpine</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ARG</b></span> proxy</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ENV</b></span> http_proxy=$proxy https_proxy=$proxy</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>USER</b></span> root</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> apk update && apk add python3</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> requirements.txt /tmp/requirements.txt</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> pip3 install -r /tmp/requirements.txt</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>USER</b></span> jenkins</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> plugins.txt /plugins.txt</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> /usr/local/bin/install-plugins.sh swarm:3.6 workflow-aggregator:2.5</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ENV</b></span> JAVA_OPTS=<span style="color: red;">"-Djenkins.install.runSetupWizard=false"</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> security.groovy /usr/share/jenkins/ref/init.groovy.d/security.groovy</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> proxy.groovy /usr/share/jenkins/ref/init.groovy.d/proxy.groovy</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> executors.groovy /usr/share/jenkins/ref/init.groovy.d/executors.groovy</span><br />
<br />
We must start with some Jenkins image in order to customize it. In my case that's the slim Alpine Linux image of Jenkins 2.89.1. Then there's a build argument for the proxy. You can ignore this part if you're not behind one.<br />
<br />
Before we modify the image, we need to switch to the root user. After we're done, we should switch back to jenkins for better security (if you wonder how to check which user the base image uses without its Dockerfile, the <span style="font-family: "courier new" , "courier" , monospace;">docker history</span> command is your friend). In my case I'm also installing some <span style="font-family: "courier new" , "courier" , monospace;">python3</span> stuff defined in the <span style="font-family: "courier new" , "courier" , monospace;">requirements.txt</span> dependency file. If you don't want to add any packages to the system, you can skip this entire part too.<br />
<br />
Then we get to configuring plugins. In various places on the Internet you can find advice to use <span style="font-family: "courier new" , "courier" , monospace;">/usr/local/bin/plugins.sh</span>, but believe me, you don't want to do this, as it installs plugins without their dependencies. The newer <span style="font-family: "courier new" , "courier" , monospace;">install-plugins.sh</span> script takes care of dependencies for you. In our case we're installing two plugins. You might want to install just the essential one: the swarm plugin.<br />
<br />
Now, four nonstandard lines. I believe that setting <span style="font-family: "courier new" , "courier" , monospace;">runSetupWizard</span> to <span style="font-family: "courier new" , "courier" , monospace;">false</span> is self-explanatory. The remaining lines are there for account setup, proxy configuration and executor configuration.<br />
<br />
Let's start with setting up the admin account. Groovy, here we go! <br />
<br />
<span style="color: magenta;"><b><span style="font-family: "courier new" , "courier" , monospace;">#!groovy</span></b></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> jenkins.model.*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> hudson.security.*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> jenkins.security.s2m.AdminWhitelistRule</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> instance = Jenkins.getInstance()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> user = <span style="color: purple;"><b>new</b></span> File(<span style="color: red;">"/run/secrets/jenkinsUser"</span>).text.trim()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> pass = <span style="color: purple;"><b>new</b></span> File(<span style="color: red;">"/run/secrets/jenkinsPassword"</span>).text.trim()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> hudsonRealm = <span style="color: purple;"><b>new</b></span> HudsonPrivateSecurityRealm(<span style="color: magenta;">false</span>)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">hudsonRealm.createAccount(user, pass)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">instance.setSecurityRealm(hudsonRealm)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> strategy = <span style="color: purple;"><b>new</b></span> FullControlOnceLoggedInAuthorizationStrategy()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">instance.setAuthorizationStrategy(strategy)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">instance.save()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">Jenkins.instance.getInjector().getInstance(AdminWhitelistRule.class).setMasterKillSwitch(<span style="color: magenta;">false</span>)</span><br />
<br />
I'm not a Groovy expert, so don't judge me by the code above. I started out knowing only that it runs on the JVM :). It actually looks like a nice managed language. The good part is that, as in Python, the code mostly speaks for itself. The Hudson legacy is visible here as well. I won't go into details - if you want to know where all of this magic comes from, pay a visit to the <a href="http://javadoc.jenkins.io/" target="_blank">official docs</a>. Don't forget that you can also use the infamous Jenkins console. I found Groovy's <span style="font-family: "courier new" , "courier" , monospace;">dump</span> built-in very helpful too.<br />
So the above script actually sets up an admin account, but doesn't hardwire anything. Both the username and the password come from <a href="https://docs.docker.com/engine/swarm/secrets/" target="_blank">Docker Secrets</a>, which allow you to manage sensitive data in your Swarm cluster nicely.<br />
<br />
Now, the second script is for proxy:<br />
<br />
<span style="color: magenta;"><b><span style="font-family: "courier new" , "courier" , monospace;">#!groovy</span></b></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> jenkins.model.*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> hudson.*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> instance = Jenkins.getInstance()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>def</b></span> pc = <span style="color: purple;"><b>new</b></span> hudson.ProxyConfiguration(<span style="color: red;">"1.2.3.4"</span>, <span style="color: magenta;">8080</span>, <span style="color: magenta;">null</span>, <span style="color: magenta;">null</span>, <span style="color: red;">"localhost,*.your.intranet.com"</span>);</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">instance.proxy = pc;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">instance.save()</span><br />
<br />
Here's some magic too. It sets up the proxy <span style="font-family: "courier new" , "courier" , monospace;">1.2.3.4:8080</span>, but with the specified exceptions. Then it modifies the Jenkins instance (which seems to be a singleton).<br />
<br />
And finally, the executors part. I wanted this one so that the master is not used as a worker at all.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>import</b></span> jenkins.model.*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">Jenkins.instance.setNumExecutors(<span style="color: magenta;">0</span>)</span><br />
<br />
<h4>
Slaves.</h4>
Now that the master is ready, let's configure slaves. Their Dockerfile is as follows.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>FROM</b></span> docker:17.03-rc</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ARG</b></span> proxy</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ENV</b></span> https_proxy=$proxy http_proxy=$proxy no_proxy=<span style="color: red;">"localhost,*.your.intranet.com"</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> apk --update add openjdk8-jre git python3</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> wget -O swarm-client.jar http://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/swarm-client/3.3/swarm-client-3.3.jar</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>ENV</b></span> http_proxy= https_proxy=</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>COPY</b></span> entrypoint.sh /</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>RUN</b></span> chmod +x /entrypoint.sh</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>CMD</b></span> [<span style="color: red;">"/entrypoint.sh"</span>]</span><br />
<br />
This time the base image is docker, because we want to have docker installed within this docker container (so this container can spawn other containers). After setting proxies (this part is not mandatory) we install Java Runtime Environment version 8 and download the swarm-client JAR. I'm using version 3.3, which is available at this URL as of today.<br />
Finally, there's an entrypoint that will execute swarm-client and do all the magic, but it relies heavily on a Docker Secret named <span style="font-family: "courier new" , "courier" , monospace;">jenkinsSwarm</span>, which should look like the following.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">-master http://master_address:8080 -username jenkinsUser -password jenkinsPassword</span><br />
<br />
<br />
Here master_address must be resolvable from the slave machines (e.g. via <span style="font-family: "courier new" , "courier" , monospace;">/etc/hosts</span>, Consul or something similar). You should also include the username and password - the same ones that you store in the other Docker Swarm secrets.<br />
<br />
If you're using Ansible like I do, it's pretty straightforward to use variables instead of hardcoding credentials. For instance, <span style="font-family: "courier new" , "courier" , monospace;">ansible-vault</span> can be used for this.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">entrypoint.sh</span> itself is almost a one-liner:<br />
<br />
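<span style="color: magenta;"><b><span style="font-family: "courier new" , "courier" , monospace;">#!/bin/sh</span></b></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>mkdir</b></span> -p /tmp/jenkins</span><br />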
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: purple;"><b>java</b></span> -jar swarm-client.jar -labels=docker -executors=1 -fsroot=/tmp/jenkins -name=docker-<span style="color: red;">$(hostname)</span> <span style="color: red;">$(cat /run/secrets/jenkinsSwarm)</span></span><br />
<br />
It assumes that it's running in the Swarm and can access <span style="font-family: "courier new" , "courier" , monospace;">/run/secrets/jenkinsSwarm</span> (the line that's pasted above).<br />
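To see what the shell does with that last argument, here's a rough Python sketch (not part of my actual setup) that mimics how the unquoted $(cat /run/secrets/jenkinsSwarm) gets expanded into separate arguments. The function name, the hostname and the sample credentials are made up for illustration; the fixed options are the ones from entrypoint.sh above.<br />

```python
def build_swarm_command(secret_text, hostname):
    """Approximate the argv that entrypoint.sh ends up passing to java."""
    base = [
        "java", "-jar", "swarm-client.jar",
        "-labels=docker", "-executors=1", "-fsroot=/tmp/jenkins",
        "-name=docker-" + hostname,
    ]
    # An unquoted $(...) is split on whitespace by the shell and quotes inside
    # it are NOT interpreted, so str.split() is a close approximation
    # (glob expansion is ignored in this sketch).
    return base + secret_text.split()


# Hypothetical secret content, in the same shape as the jenkinsSwarm secret.
secret = "-master http://master_address:8080 -username alice -password s3cret\n"
print(build_swarm_command(secret, "worker-1"))
```

One practical consequence: a password containing whitespace would be split into several arguments, so it's safer to keep such characters out of the secret.<br />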
<br />
<h4>
Glueing it all together.</h4>
The building blocks are already in place. Now it's time to glue everything together. I don't want to go into details here, because this is not the primary topic of this blog post. If you're interested in how exactly I did everything, please let me know in the comments and I will create a GitHub repo. Let me, however, give you some important hints:<br />
<ul>
<li>if you want a slave to be able to spawn other containers (on the same host on which the slave is running), you must bind mount the <span style="font-family: "courier new" , "courier" , monospace;">docker.sock</span> file, e.g. like this: <span style="color: red;"><span style="font-family: "courier new" , "courier" , monospace;">"/var/run/docker.sock:/var/run/docker.sock"</span></span>. There's more to this, though! The Docker daemon will not allow the <span style="font-family: "courier new" , "courier" , monospace;">jenkins</span> user to spawn containers, so you must somehow circumvent this problem. I'm circumventing this by adding the <span style="font-family: "courier new" , "courier" , monospace;">jenkins</span> user to the docker group, but this works only because there's a 1:1 mapping between the host and the container.</li>
<li>you should have three secrets in the Docker Swarm cluster: <span style="font-family: "courier new" , "courier" , monospace;">jenkinsUser</span>, <span style="font-family: "courier new" , "courier" , monospace;">jenkinsPassword</span> and <span style="font-family: "courier new" , "courier" , monospace;">jenkinsSwarm</span> with the username, password, and swarm-client.jar arguments respectively</li>
<li>machines must be able to communicate. For internal JNLP communication, port <span style="color: magenta;"><span style="font-family: "courier new" , "courier" , monospace;">50000/tcp</span></span> must be opened.</li>
<li>if you set the deployment mode to global in the <span style="font-family: "courier new" , "courier" , monospace;">docker-compose.yml</span> file (if you're using one), then you will have as many slaves as machines in the cluster, which can be nice</li>
<li>if you're gonna stick to this solution for a longer period of time, I recommend thinking about scaling out and in horizontally: it should be as simple as adding/removing machines from the cluster: just one <span style="font-family: "courier new" , "courier" , monospace;">terraform</span> command followed by an <span style="font-family: "courier new" , "courier" , monospace;">ansible-playbook</span> spell.</li>
</ul>
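A quick way to verify the port hint from the list above is to try opening a TCP connection from a slave machine to the master. This is only a minimal sketch: can_connect is a hypothetical helper of mine, and master_address is the same placeholder that was used earlier in this post.<br />

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example, to be run from a slave machine (not executed here):
#   can_connect("master_address", 50000)   # JNLP port
#   can_connect("master_address", 8080)    # Jenkins HTTP port
```

If the first check fails while the second one passes, a firewall rule for 50000/tcp is a likely suspect.<br />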
<br />
<br />
Hopefully this post helps you set up a Jenkins cluster that simply works. If you'd like to see the code, let me know in the comments!<br />
<br />
<b>Airflow Docker with Xcom push and pull</b> (posted 2017-12-08)<br />
<br />
Recently, in one of the projects I'm working on, we started to research technologies that can be used to design and execute data processing flows. The amount of data to be processed is measured in terabytes, hence we were aiming at solutions that can be deployed in the cloud. Solutions from the Apache umbrella like Hadoop, Spark, or Flink were on the table from the very beginning, but we also looked at others like Luigi or Airflow, because our use case was neither MapReducible nor stream-based.<br />
<br />
Airflow caught our attention and we decided to give it a shot just to see if we could create a PoC using it*. In order to execute the PoC sooner rather than later, we planned to provision a Swarm cluster for this.<br />
<br />
In Airflow you can find a couple of so-called operators that allow you to execute actions. There are operators for Bash or Python, but you can also find something for e.g. Hive. Fortunately, there is also a Docker operator for us.<br />
<br />
<b>Local PoC</b><br />
The PoC started on my laptop and not in the cluster. Thankfully, DockerOperator allows you to pass the URL of the docker daemon, so moving from the laptop to the cluster is close to just changing one parameter. Nice! <br />
<br />
If you want to run the Airflow server locally from inside a container, have it running as non-root (you should!) and bind docker.sock from the host to the container, you must create a docker group in the container that mirrors the docker group on your host and then add e.g. the airflow user to this group. That does the trick...<br />
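As a rough illustration of the "mirroring" part: the group that matters is the one owning docker.sock, so its numeric GID is what the container-side group needs to match. The sketch below only prints the commands one could run while building the image; the helper name is hypothetical, and it assumes a Debian-like image where groupadd and usermod are available and where that GID is not already taken.<br />

```python
import os


def docker_group_commands(sock_path="/var/run/docker.sock", user="airflow"):
    """Build shell commands that mirror the socket's group inside a container."""
    # st_gid is the numeric ID of the group that owns the socket on the host.
    gid = os.stat(sock_path).st_gid
    return [
        "groupadd -g {} docker".format(gid),
        "usermod -aG docker {}".format(user),
    ]


# Example, to be run on the host (not executed here):
#   for command in docker_group_commands():
#       print(command)
```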
<br />
So just running DockerOperator is not black magic. However, if your containers need to exchange data, it gets a little bit trickier.<br />
<br />
<b>Xcom push/pull</b><br />
The push part is simple and documented. Just set the <span style="font-family: "Courier New", Courier, monospace;">xcom_push</span> parameter to <span style="font-family: "Courier New", Courier, monospace;">True</span> and the last line of the container's stdout will be published by Airflow as if it had been pushed programmatically. It looks like this is the natural Airflow way.<br />
<br />
Pull is not that obvious, perhaps because it's not documented. You can't read stdin or anything like that. The way to do this involves connecting two dots:<br />
<ul>
<li>command parameter can be Jinja2-templated</li>
<li>one of the macros allows you to do xcom_pull </li>
</ul>
So you need to prepare your containers in a special way so they can pull/push. Let's start with a container that pushes something:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: purple;">FROM</span> debian<br /><span style="color: purple;">ENTRYPOINT</span> echo <span style="color: red;">'{"i_am_pushing": "json"}'</span></span><br />
<br />
Simple enough. Now the pulling container:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;"><span style="color: purple;">FROM</span> debian<br /><span style="color: purple;">COPY</span> ./entrypoint /<br /><span style="color: purple;">ENTRYPOINT</span> [<span style="color: red;">"/entrypoint"</span>]</span><br />
<br />
The entrypoint script can be anything and will get the JSON as <span style="font-family: "Courier New", Courier, monospace;">$1</span>. A crucial (and easy to miss) requirement for this to work is that <span style="font-family: "Courier New", Courier, monospace;">ENTRYPOINT</span> must use the exec form. Yes, there are two forms of <span style="font-family: "Courier New", Courier, monospace;">ENTRYPOINT</span>. If you use the one without the array (the shell form), the parameters will not be passed to the container's process!<br />
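For illustration, here's what such an entrypoint could look like if written in Python. This is a minimal sketch and not the script I used: it assumes the image has Python 3 installed (plain debian does not) and that the pushed value arrives intact as a single argument. The parse_payload name is made up.<br />

```python
#!/usr/bin/env python3
import json
import sys


def parse_payload(argv):
    """Return the decoded JSON passed as the first argument, or None."""
    if len(argv) < 2:
        return None
    return json.loads(argv[1])


if __name__ == "__main__":
    payload = parse_payload(sys.argv)
    # The last line of stdout is what Airflow would publish
    # if this task also had xcom_push enabled.
    print(json.dumps({"received": payload}))
```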
<br />
Finally, you can glue things together and you're done. The <span style="font-family: "Courier New", Courier, monospace;">ti</span> macro allows us to get data pushed by another task. <span style="font-family: "Courier New", Courier, monospace;">ti</span> stands for <span style="font-family: "Courier New", Courier, monospace;">task_instance</span>.<br />
<br />
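<span style="font-family: "Courier New", Courier, monospace;"><span style="color: purple;">from</span> datetime <span style="color: purple;">import</span> timedelta<br /><span style="color: purple;">from</span> airflow <span style="color: purple;">import</span> DAG<br /># import path as in Airflow 1.x<br /><span style="color: purple;">from</span> airflow.operators.docker_operator <span style="color: purple;">import</span> DockerOperator<br /><br /># default_args (owner, start_date, ...) is assumed to be defined earlier in the DAG file<br />dag = DAG(<span style="color: red;">'docker'</span>, default_args=default_args, schedule_interval=timedelta(<span style="color: purple;">1</span>))<br /><br />t1 = DockerOperator(task_id=<span style="color: red;">'docker_1'</span>, dag=dag, image=<span style="color: red;">'docker_1'</span>, xcom_push=<span style="color: purple;">True</span>)<br /><br />t2 = DockerOperator(task_id=<span style="color: red;">'docker_2'</span>, dag=dag, image=<span style="color: red;">'docker_2'</span>, command=<span style="color: red;">'{{ ti.xcom_pull(task_ids="docker_1") }}'</span>)<br /><br />t2.set_upstream(t1)</span><br />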
<br />
<br />
<b>Conclusion</b><br />
Docker can be used in Airflow along with Xcom push/pull functionality. It isn't very convenient and is not well documented I would say, but at least it works. <br />
<br />
If time permits, I'm going to create a PR documenting the pull operation. I don't know how it will work out, because in the Airflow GitHub project there are 237 PRs now and some of them have been there since May 2016!<br />
<br />
<br />
* the funny thing is that we considered Jenkins too! ;-)<br />
<br />
<b>Tests stability S09E11 (Docker, Selenium)</b> (posted 2017-11-29)<br />
If you're experienced in setting up automated testing with Selenium and Docker you'll perhaps agree with me that it's not the most stable thing in the world. Actually it's far far away from any stable island - right in the middle of "the sea of instability".<br />
<br />
When you think about failures in automated testing and how they develop as the system grows, it can resemble drug use. Seriously. When you start, occasional failures are ignored. You close your eyes and click "Retry". Innocent. But after some time it snowballs into a problem. And you find yourself wearing a blindfold that you can't remember buying.<br />
<br />
This post is a short story of how, in one small project, we started with occasional failures and ended up with... well... you'll see. Read on ;).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2KnNtxdkg3YISgQxS2xEUk5peAwWNPf_M7gQ2csfXj7kpWiEoS7F4WPzbdc2Fsz9gy6A7u5SWW8xqemCtLQQ3PFBOf07a_xhhWwnMx6X-2ILJ7IImSGVy7_DX5CU4XD9NlBt9s95wTtc/s1600/2881603057_820af9d26a.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="278" data-original-width="318" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2KnNtxdkg3YISgQxS2xEUk5peAwWNPf_M7gQ2csfXj7kpWiEoS7F4WPzbdc2Fsz9gy6A7u5SWW8xqemCtLQQ3PFBOf07a_xhhWwnMx6X-2ILJ7IImSGVy7_DX5CU4XD9NlBt9s95wTtc/s1600/2881603057_820af9d26a.jpg" /></a></div>
<br />
<br />
For the past couple of months I kept thinking that "others have worse setups and live", but today it all culminated: I have achieved the fourth degree of density and decided to stop being quiet.<br />
<br />
<b>Disclaimer</b><br />
In the middle of this post you might start to think that our environment is simply broken. That's damn right. The cloud in which we're running is not very stable. Sometimes it behaves as if it were sulking. There are problems with proxies too. And finally we add Docker and Selenium to the mix. I think a testimonial from one of our engineers sums it all up:<br />
<blockquote class="tr_bq">
<span lang="en-PH"><span style="font-family: "calibri" , sans-serif; font-size: x-small;"><span style="font-size: 11pt;">if retry didn’t fix it for the 10<sup>th</sup> time, then there’s definitely something wrong</span></span></span></blockquote>
One more thing must be noted as well. The project I'm referring to is just a side project. It's an attempt to innovate some process, unsupported by the business whatsoever.<br />
<br />
<b>The triggers</b><br />
I was pressing the "Retry" button yet again on two of the e2e jobs and saw the following.<br />
<br />
<span style="font-size: small;"><span style="font-family: "courier new" , "courier" , monospace;">// job 1<br />couldn't stat /proc/self/fd/18446744073709551615: stat /proc/self/fd/23: no such file or directory<br /><br />// job 2<br />Service 'frontend' failed to build: readlink /proc/4304/exe: no such file or directory</span></span><br />
<br />
What the hell is this? We had never seen this before and now apparently it has become commonplace in our CI pipeline (it was the nth retry).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnckJgpDarMx8n0ursR__vG_0TfwNLDJvH-PlPToaAaaFQLI5hh79OeouhHEWjS_PtnezWNbfzX7DKBcs_W9vJlzEtG7JeZWLJrlfAXUntID1wKYAp5yNrQHcKH5yn4GgPfUHJ9BEYI3E/s1600/Mad_scientist.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1589" data-original-width="1094" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnckJgpDarMx8n0ursR__vG_0TfwNLDJvH-PlPToaAaaFQLI5hh79OeouhHEWjS_PtnezWNbfzX7DKBcs_W9vJlzEtG7JeZWLJrlfAXUntID1wKYAp5yNrQHcKH5yn4GgPfUHJ9BEYI3E/s400/Mad_scientist.svg.png" width="275" /></a></div>
<br />
So this big number after /fd/ is -1 interpreted as an unsigned 64-bit value. Perhaps something in Selenium calls a function that returns an error and then calls the stat syscall, passing -1 as an argument. The function's return value was not checked!<br />
The second error message is most probably related to docker. Something tries to find the executable for some PID. Why?<br />
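The claim about the big number is easy to double-check. The snippet below is just arithmetic: as_signed64 is a helper name I made up, and it reinterprets an unsigned 64-bit value as a signed one (two's complement).<br />

```python
def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed one."""
    return value - 2**64 if value >= 2**63 else value


fd = 18446744073709551615  # the number from the error message

print(fd == 2**64 - 1)              # True
print(as_signed64(fd))              # -1
print(-1 & 0xFFFFFFFFFFFFFFFF)      # 18446744073709551615
```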
<br />
"Retry" did not work this time. Re-deploying the e2e workers didn't help either. I thought that now was the time to get some insight into what is actually happening and how many failures were caused by the unstable environment.<br />
<br />
Luckily we're running on GitLab, which provides a reasonable API. Read on to see what I've found. I personally find it hilarious.<br />
<br />
<b>Insight into failures</b><br />
It's extremely easy to make use of the GitLab CI API (thanks GitLab guys!). I extracted JSON objects for every job in every pipeline that was recorded in our project and started playing with the data.<br />
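The extraction can be sketched roughly like this. It is not the exact script I used: it assumes GitLab API v4, a personal access token, and the paginated "list project jobs" endpoint; the base URL, project ID and token in the usage comment are placeholders. Paging is kept separate from the HTTP call so that the loop can be tried without a network.<br />

```python
import json
import urllib.request


def fetch_page(base_url, project_id, token, page, per_page=100):
    """Fetch one page of jobs for a project from the GitLab API (v4 assumed)."""
    url = "{}/api/v4/projects/{}/jobs?per_page={}&page={}".format(
        base_url, project_id, per_page, page)
    request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all_jobs(get_page):
    """Call get_page(1), get_page(2), ... until an empty page comes back."""
    jobs = []
    page = 1
    while True:
        batch = get_page(page)
        if not batch:
            return jobs
        jobs.extend(batch)
        page += 1


# Example (placeholders, not executed here):
#   jobs = fetch_all_jobs(
#       lambda page: fetch_page("https://gitlab.example.com", 42, "TOKEN", page))
```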
<br />
The first thing that I checked was how many failures there are per particular type of test. Names are anonymized a little because I'm unsure if this is sensitive data or not. Better safe than sorry!<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivysGp7zZ0KOt37VnlNcc79O0QTd3RlT5iZ7lOcXerIhzmbY4EWE2I7T6hcmrGLf-RyTjJSca8q5xuSWC2oSL3dX1hSEMpMdd6MJMdbpe57m3PqCI9Dr-CHjmZPrq8Z6InSz_BFR_RFb4/s1600/Figure_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="872" data-original-width="1600" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivysGp7zZ0KOt37VnlNcc79O0QTd3RlT5iZ7lOcXerIhzmbY4EWE2I7T6hcmrGLf-RyTjJSca8q5xuSWC2oSL3dX1hSEMpMdd6MJMdbpe57m3PqCI9Dr-CHjmZPrq8Z6InSz_BFR_RFb4/s640/Figure_1.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1: Successful/failed jobs, per job name</td></tr>
</tbody></table>
I knew that some tests were failing often, but these results show that in some cases almost 50% of the jobs fail! Insane! BTW we recently split some of the long-running e2e test suites into smaller jobs, which is visible in the figure.<br />
But now one can argue that maybe this is because of bugs in the code. Let's see. To tell, we must analyze the data based on commit hashes: how many commits in particular jobs were executed multiple times and finished with different statuses. In other words: we look for situations in which the job status varied even though the code did not change.<br />
<br />
The numbers for our repository are:<br />
<ul>
<li>number of (commit, job) pairs with at least one success: <b>23550</b></li>
<li>total number of failures for these pairs: <b>1484</b></li>
</ul>
<br />
In other words, failures that went away without any code change amount to ~<b>6.30%</b> of these pairs (1484 / 23550), and this is only a lower bound on what the unstable environment caused. It might look like a small number, but if you take into account that a single job can last 45 minutes, it adds up to a lot of wasted time, especially since failure notifications aren't always handled immediately. I also have a hunch that at some point people started to click "Retry" just to make sure the problem was not with the environment.<br />
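The counting itself can be sketched as below. It is a simplified take, not my original script: flaky_stats is a made-up name, and it assumes each job record carries the commit.id, name and status fields the way GitLab's job JSON does.<br />

```python
from collections import defaultdict


def flaky_stats(jobs):
    """Return (pairs with at least one success, failures within those pairs)."""
    statuses_by_pair = defaultdict(list)
    for job in jobs:
        pair = (job["commit"]["id"], job["name"])
        statuses_by_pair[pair].append(job["status"])

    # Keep only (commit, job) pairs that passed at least once: a failure in
    # such a pair happened on exactly the same code that later succeeded.
    passed = [s for s in statuses_by_pair.values() if "success" in s]
    failures = sum(statuses.count("failed") for statuses in passed)
    return len(passed), failures


# The ratio quoted in the post:
print("{:.2%}".format(1484 / 23550))  # 6.30%
```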
<br />
My top 5 picks among all of these failures are below.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">hash:job | #tot | success/fail | users clicking "Retry"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">----------------------------------------------------------------</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">d7f43f9c:e2e-7 | 19 | ( 1/17) | user-6,user-7,user-5<br />2fcecb7c:e2e-7 | 16 | ( 8/ 8) | user-6,user-7<br />2c34596f:other-1 | 14 | ( 1/13) | user-8<br />525203c6:other-13 | 12 | ( 1/ 8) | user-13,user-11<br />3457fbc5:e2e-6 | 11 | ( 2/ 9) | user-14</span><br />
<br />
So, for instance, commit d7f43f9c failed on job e2e-7 17 times, and three distinct users tried to make it pass by clicking the "Retry" button over and over. And finally they made it! Ridiculous, isn't it?<br />
<br />
And speaking of time, I've also checked jobs that lasted for an enormous amount of time. The winners are:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">job:status | time (hours)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">---------------------------------</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">other-2:failed | 167.30<br />other-8:canceled | 118.89<br />other-4:canceled | 27.19<br />e2e-7:success | 26.12<br />other-1:failed | 26.01</span><br />
<br />
Perhaps these are just outliers. Histograms would give better insight. But even if they are outliers, they're crazy outliers.<br />
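Both the "winners" table and the histogram idea can be sketched in a few lines. The function names are hypothetical, and the code assumes each job record has a duration field in seconds (possibly missing for jobs that never ran).<br />

```python
from collections import Counter


def longest_jobs(jobs, top=5):
    """Return ("name:status", hours) tuples for the longest-running jobs."""
    timed = [job for job in jobs if job.get("duration") is not None]
    timed.sort(key=lambda job: job["duration"], reverse=True)
    return [
        ("{}:{}".format(job["name"], job["status"]),
         round(job["duration"] / 3600, 2))
        for job in timed[:top]
    ]


def duration_histogram(jobs, bucket_minutes=15):
    """Count jobs per duration bucket; keys are bucket starts in minutes."""
    histogram = Counter()
    for job in jobs:
        if job.get("duration") is None:
            continue
        minutes = int(job["duration"] // 60)
        histogram[minutes // bucket_minutes * bucket_minutes] += 1
    return histogram
```

With buckets like these, a few multi-day jobs show up as isolated bars far to the right instead of skewing an average.<br />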
<br />
<br />
I have also attempted to detect the reason for each failure, but this is a more complex problem to solve. It requires parsing logs and guessing which line was the first one indicating an error condition. Then comes the second guess: whether the problem originated from the environment or from the code.<br />
Maybe such a task could somehow be handled by (in)famous machine learning. Actually, there are more items that could be achieved with ML support. The simplest examples are:<br />
<ul>
<li>giving estimation whether the job will fail</li>
<ul>
<li>also, providing reason of failure</li>
<li>if the failure originated from faulty environment, what exactly was it? </li>
</ul>
<li>estimated time for the pipeline to finish</li>
<li>auto-retry in case of env-related failure</li>
</ul>
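Before reaching for ML, the two guesses described above can be approximated with plain regular expressions. This is only a naive sketch: the pattern lists are hypothetical and seeded with the two error messages quoted earlier in this post, and a real classifier would need many more of them.<br />

```python
import re

# Hypothetical signatures of environment-related failures.
ENV_PATTERNS = [
    re.compile(r"no such file or directory"),
    re.compile(r"failed to build"),
]
# Very rough hint that a log line reports an error at all.
ERROR_HINT = re.compile(r"error|failed|couldn't|no such file", re.IGNORECASE)


def first_error_line(log_text):
    """First guess: the first line that looks like an error, or None."""
    for line in log_text.splitlines():
        if ERROR_HINT.search(line):
            return line
    return None


def classify(log_text):
    """Second guess: 'environment', 'code' or 'unknown'."""
    line = first_error_line(log_text)
    if line is None:
        return "unknown"
    if any(pattern.search(line) for pattern in ENV_PATTERNS):
        return "environment"
    return "code"
```

A result of "environment" would then be a candidate for the auto-retry idea from the list above.<br />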
<br />
<b>Conclusions</b><br />
Apparently I've had a much more unstable e2e test environment than I ever thought. The lesson learned is that if you get used to solving problems by retrying, you lose the sense of how much trouble you are in.<br />
<br />
As with any other engineering problem, you first need to gather data and then decide what to do next. Based on the numbers I have now, I'm planning to implement some ideas to make life easier.<br />
<br />
While analyzing the data I had moments when I couldn't stop laughing to myself. But the reality is sad. It started with occasional failures and ended with a continuous problem. And we weren't doing much about it. The problem was not that we were effed in the ass. The problem was that we started to arrange our place there. Insights will help us get out.<br />
<br />
Share your ideas in the comments. If we bootstrap a discussion I'll do my best to share the code I have on GitHub.<br />
<br />
<b>C++: on the dollar sign</b> (posted 2017-04-25)<br />
<br />
In most programming languages there are sane rules that specify what can be an identifier and what cannot. Most of the time it's even intuitive - it's just something that matches <span style="font-family: "Courier New",Courier,monospace;">[_a-zA-Z][_a-zA-Z0-9]*</span>. There are languages that allow more (e.g. $ in PHP/JS, or <a href="http://www.originlab.com/doc/LabTalk/guide/String-registers" target="_blank">% in LabTalk</a>). How about C++? The answer to this question may be a little surprising.<br />
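The intuitive rule is easy to try out mechanically. The sketch below uses a pattern that also allows underscores after the first character; is_plain_identifier is a made-up helper, and it models only the "intuitive" rule, not what any particular C++ compiler accepts.<br />

```python
import re

IDENTIFIER = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")


def is_plain_identifier(name):
    """True if the whole string matches the intuitive identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ["snake_case", "_foo1", "1abc", "$this"]:
    print(candidate, is_plain_identifier(candidate))
```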
<br />
Almost a year ago a friend of mine and I had a little argument about whether the dollar sign is allowed within C++ identifiers. In other words, it was about whether e.g. <span style="background-color: #eeeeee; font-family: "courier new" , "courier" , monospace;">int $this = 1;</span> is legal C++ or not.<br />
Basically, I was claiming that it's not possible. My friend, on the other hand, recalled a friend of his who had mentioned that dollars are fine.<br />
<br />
The first line of defense is of course the nearest compiler. I decided to fire one up and simply check what happens if I compile the following fragment of code.<br />
<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;"><span class="Type" style="color: seagreen; font-size: 1em; font-weight: bold;">auto</span> $foo() {
    <span class="Type" style="color: seagreen; font-size: 1em; font-weight: bold;">int</span> $bar = <span class="Constant" style="color: deeppink; font-size: 1em;">1</span>;
    <span class="Statement" style="color: brown; font-size: 1em; font-weight: bold;">return</span> $bar;
}</pre>
<br />
At the time I had gcc-4.9.3 installed on my system (a prehistoric version, I know ;-). For the record, the command looked like this: <span style="background-color: #f3f3f3; font-family: "courier new" , "courier" , monospace;">g++ dollar.cpp -std=c++1y -c -Wall -Wextra -Werror</span>.<br />
<br />
And to my surprise... it compiled without a single complaint. Moreover, clang and MSVC gulped this down without complaining as well. Well, Sławek - I said to myself - even if you've been mastering something for years, there's still a lot that can surprise you. BTW such a conclusion puts titles like the following in a much funnier light.<br />
<br />
<div style="text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLJiteM2mYdjBsjLT57HKj-0MfhJIrbmaly2KZoTNRZp9E5EW7vqKGw2TOZ_53jUYBiN6eBPhdnlfgJWjOinP56T7u1AS-7-mNmUfNdGJ3gnlb3o932O7Yik-tQuCQtfygzbJj1RRlQe8/s1600/41Cozm-LkhL._SX333_BO1%252C204%252C203%252C200_.jpg" imageanchor="1"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLJiteM2mYdjBsjLT57HKj-0MfhJIrbmaly2KZoTNRZp9E5EW7vqKGw2TOZ_53jUYBiN6eBPhdnlfgJWjOinP56T7u1AS-7-mNmUfNdGJ3gnlb3o932O7Yik-tQuCQtfygzbJj1RRlQe8/s200/41Cozm-LkhL._SX333_BO1%252C204%252C203%252C200_.jpg" width="134" /></a></div>
<br />
It was a normal office day and we had other work to get done, so I reluctantly accepted this as just another dark corner. After a couple of hours I forgot about the whole thing, only for it to resurface... a couple of weeks later.<br />
<br />
So, fast forward a couple of weeks. I was preparing something related to C++ and I accidentally found a reference to the dollar sign in the GCC documentation. It was a nice feeling, because I knew I would fill this hole in my knowledge in a matter of minutes. So what was the reason compilers were happily accepting dollar signs? <br />
Let me put here an excerpt from the GCC documentation, which speaks for itself :)<br />
<blockquote class="tr_bq">
<blockquote class="tr_bq">
<i>GCC allows the ‘<samp>$</samp>’ character in identifiers as an extension for most targets. This is true regardless of the <samp>std=</samp> switch, since this extension cannot conflict with standards-conforming programs. When preprocessing assembler, however, dollars are not identifier characters by default.</i><br />
<i>Currently the targets that by default do not permit ‘<samp>$</samp>’ are AVR, IP2K, MMIX, MIPS Irix 3, ARM aout, and PowerPC targets for the AIX operating system.</i><br />
<i>You can override the default with <samp>-fdollars-in-identifiers</samp> or <samp>-fno-dollars-in-identifiers</samp>. See <a href="https://gcc.gnu.org/onlinedocs/cpp/fdollars-in-identifiers.html#fdollars-in-identifiers">fdollars-in-identifiers</a>.</i></blockquote>
</blockquote>
<br />
I think the three most important things are:<br />
<ol>
<li>By default it does not apply when preprocessing assembler sources.</li>
<li>It is independent of the -std switch.</li>
<li>Some targets do not permit it by default.</li>
</ol>
What got me thinking is this list of architectures. It took me a couple of minutes to find out that e.g. the assembler for ARM doesn't allow the dollar sign, so any assembly code generated by GCC for ARM would not assemble if a dollar sign was used in an identifier. That's a plausible explanation why GCC doesn't allow this character on all architectures. It doesn't explain why compilers allow it on the others, though.<br />
<br />
GCC could theoretically mitigate the problem on particular architectures by replacing $ signs with some other character, but then a bunch of other problems would appear: possible name conflicts, name mangling/demangling yielding incorrect values, and finally it wouldn't be possible to export such "changed" symbols from a library. In other words: disaster.<br />
<br />
What about the standard?<br />
<br />
After thinking about it for a minute I felt a strong need to see what exactly "identifier" means. So I opened <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf" target="_blank">N3797</a> and quickly found the section I was looking for, namely (surprise-surprise) <i>2.11 Identifiers</i>. So what does this section say?<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggasctCfRkAB3fQPVipV-GnAKFt6x7RpmIcteN1u3bePyut7trEttMz6SKdtOFJOOuo6zlMNVnlgP6LmOqvtQkJnDdjNOCik0qbjQwpBO-EDSAuPU5r1tLBztsVc2vDmij-b3uRDbF9dI/s1600/identifiers.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggasctCfRkAB3fQPVipV-GnAKFt6x7RpmIcteN1u3bePyut7trEttMz6SKdtOFJOOuo6zlMNVnlgP6LmOqvtQkJnDdjNOCik0qbjQwpBO-EDSAuPU5r1tLBztsVc2vDmij-b3uRDbF9dI/s400/identifiers.png" width="321" /></a></div>
<br />
<br />
Right after the formal definition there is an explanation which refers to sections E.1 and E.2, but that's not important here. There is one more thing in the formal definition, and it's extremely easy to miss: "other implementation-defined characters". What does it mean? Yup - an implementation may accept any other character in identifiers at will.<br />
<br />
P.S. Surprisingly, cppcheck 1.71 doesn't report the $ sign in identifiers as a problem at all.<br />
<br />
<b>Getting all parent directories of a path</b> (2017-01-06)<br />
<span style="color: #e06666;">edit: reddit updates</span> <br />
<br />
A few minutes ago I needed to solve the trivial problem of getting all parent directories of a path. It's very easy to do it imperatively, but that would simply not satisfy me. Hence, I challenged myself to do it declaratively in Python.<br />
<br />
The problem is simple, but let me put an example on the table, so it's even easier to imagine what we are talking about.<br />
<br />
Given some path, e.g.<br />
<br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">/home/szborows/code/the-best-project-in-the-world</span></span><br />
<br />
You want to have the following list of parents:<br />
<br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">/home/szborows/code</span></span><br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">/home/szborows</span></span><br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">/home</span></span><br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">/</span></span><br />
<br />
It's trivial to do this using <span style="font-family: "courier new" , "courier" , monospace;">split</span> and a for loop. How to make it more declarative?<br />
Thinking more mathematically (mathematicians will perhaps cry out to heaven for vengeance after reading on, but let me at least try...) we simply want all prefixes of S, the ordered sequence of path components. Each prefix can be described by a pair of indices <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">(1, y)</span></span>, where y belongs to <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">[1, len S)</span></span><span style="font-family: inherit;">. Since the first index is always 1, we can ignore it and operate on y alone.</span><br />
<span style="font-family: inherit;">In Python, to generate the numbers from len(path.split('/')) - 1 down to 0 we can use <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">range()</span></span> and <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">[::-1]</span></span> (an idiom for reversing a sequence); the trailing <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">if l</span></span> filter drops the 0. Then <span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">join()</span></span> is applied to the split path, sliced from 1 to y. That's it. And now a demonstration:</span><br />
<br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">>>> path = '/a/b/c/d' </span></span><br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">>>> <b>['/' + '/'.join(path.split('/')[1:l]) for l in range(len(path.split('/')))[::-1] if l]</b></span></span><br />
<span style="color: purple;"><span style="font-family: "courier new" , "courier" , monospace;">['/a/b/c', '/a/b', '/a', '/']</span></span><br />
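<br />
As a cross-check for the one-liner above, the standard library can produce the same list: <span style="font-family: "courier new" , "courier" , monospace;">pathlib.PurePosixPath.parents</span> yields the ancestors of a path, nearest first. This is not the approach explored in this post, just a minimal sketch for comparing results; the function name <span style="font-family: "courier new" , "courier" , monospace;">parents_pathlib</span> is made up for illustration.<br />

```python
from pathlib import PurePosixPath


def parents_pathlib(path):
    """Return all parent directories of an absolute POSIX path, nearest first."""
    return [str(p) for p in PurePosixPath(path).parents]


print(parents_pathlib('/a/b/c/d'))  # ['/a/b/c', '/a/b', '/a', '/']
```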
<br />
But what about performance? Which one will be faster - the imperative or the declarative approach? Intuition suggests that the imperative version will win, but let's check.<br />
<br />
In the picture below you can see timeit (n=1000000) results from my machine (i5-6200U, Python 3.5.2+) for three paths:<br />
<br />
<pre style="background: #ffffff; color: black;">short_path <span style="color: #808030;">=</span> <span style="color: #0000e6;">'/lel'</span>
regular_path <span style="color: #808030;">=</span> <span style="color: #0000e6;">'/jakie/jest/srednie/zagniezdzenie?'</span>
long_path <span style="color: #808030;">=</span> <span style="color: #0000e6;">'/z/lekka/dlugasna/sciezka/co/by/pierdzielnik/mial/troche/roboty'</span></pre>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-ZH4O0-nnK2kZFyWjzxztoUpMp6YmuIqp__nBLM8v2ADDKhb1o_0teUIXmG-iS_TCPvJ-vQOCiypNw9eFDGOdgC24PlUI2AeNSkCuUaV_EogO4dCVuEXJ6JvNbHQhy6HMLIha24lhCDY/s1600/Rplots.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-ZH4O0-nnK2kZFyWjzxztoUpMp6YmuIqp__nBLM8v2ADDKhb1o_0teUIXmG-iS_TCPvJ-vQOCiypNw9eFDGOdgC24PlUI2AeNSkCuUaV_EogO4dCVuEXJ6JvNbHQhy6HMLIha24lhCDY/s400/Rplots.png" width="400" /></a></div>
Implementations used:<br />
<br />
<pre style="background: #ffffff; color: black;"><span style="color: maroon; font-weight: bold;">def</span> imper1<span style="color: #808030;">(</span>path<span style="color: #808030;">)</span><span style="color: #808030;">:</span>
    result <span style="color: #808030;">=</span> <span style="color: #808030;">[</span><span style="color: #808030;">]</span>
    <span style="color: maroon; font-weight: bold;">for</span> i <span style="color: maroon; font-weight: bold;">in</span> <span style="color: #400000;">range</span><span style="color: #808030;">(</span><span style="color: #008c00;">1</span><span style="color: #808030;">,</span> <span style="color: #400000;">len</span><span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">:</span>
        y <span style="color: #808030;">=</span> <span style="color: #0000e6;">'/'</span><span style="color: #808030;">.</span>join<span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">[</span><span style="color: #808030;">:</span>i<span style="color: #808030;">]</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">or</span> <span style="color: #0000e6;">'/'</span>
        result<span style="color: #808030;">.</span>append<span style="color: #808030;">(</span>y<span style="color: #808030;">)</span>
    <span style="color: maroon; font-weight: bold;">return</span> result

<span style="color: maroon; font-weight: bold;">def</span> imper2<span style="color: #808030;">(</span>path<span style="color: #808030;">)</span><span style="color: #808030;">:</span>
    i <span style="color: #808030;">=</span> <span style="color: #400000;">len</span><span style="color: #808030;">(</span>path<span style="color: #808030;">)</span> <span style="color: #44aadd;">-</span> <span style="color: #008c00;">1</span>
    l <span style="color: #808030;">=</span> <span style="color: #808030;">[</span><span style="color: #808030;">]</span>
    <span style="color: maroon; font-weight: bold;">while</span> i <span style="color: #44aadd;">></span> <span style="color: #008c00;">0</span><span style="color: #808030;">:</span>
        <span style="color: maroon; font-weight: bold;">while</span> i <span style="color: #44aadd;">!=</span> <span style="color: #008c00;">0</span> <span style="color: maroon; font-weight: bold;">and</span> path<span style="color: #808030;">[</span>i<span style="color: #808030;">]</span> <span style="color: #44aadd;">!=</span> <span style="color: #0000e6;">'/'</span><span style="color: #808030;">:</span>
            i <span style="color: #44aadd;">-</span><span style="color: #808030;">=</span> <span style="color: #008c00;">1</span>
        l<span style="color: #808030;">.</span>append<span style="color: #808030;">(</span>path<span style="color: #808030;">[</span><span style="color: #808030;">:</span>i<span style="color: #808030;">]</span> <span style="color: maroon; font-weight: bold;">or</span> <span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span>
        i <span style="color: #44aadd;">-</span><span style="color: #808030;">=</span> <span style="color: #008c00;">1</span>
    <span style="color: maroon; font-weight: bold;">return</span> l

<span style="color: maroon; font-weight: bold;">def</span> decl1<span style="color: #808030;">(</span>path<span style="color: #808030;">)</span><span style="color: #808030;">:</span>
    <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">[</span><span style="color: #0000e6;">'/'</span> <span style="color: #44aadd;">+</span> <span style="color: #0000e6;">'/'</span><span style="color: #808030;">.</span>join<span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">[</span><span style="color: #008c00;">1</span><span style="color: #808030;">:</span>l<span style="color: #808030;">]</span><span style="color: #808030;">)</span>
            <span style="color: maroon; font-weight: bold;">for</span> l <span style="color: maroon; font-weight: bold;">in</span> <span style="color: #400000;">range</span><span style="color: #808030;">(</span><span style="color: #400000;">len</span><span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">[</span><span style="color: #808030;">:</span><span style="color: #808030;">:</span><span style="color: #44aadd;">-</span><span style="color: #008c00;">1</span><span style="color: #808030;">]</span> <span style="color: maroon; font-weight: bold;">if</span> l<span style="color: #808030;">]</span>

<span style="color: maroon; font-weight: bold;">def</span> decl2<span style="color: #808030;">(</span>path<span style="color: #808030;">)</span><span style="color: #808030;">:</span>
    <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">[</span><span style="color: #0000e6;">'/'</span> <span style="color: #44aadd;">+</span> <span style="color: #0000e6;">'/'</span><span style="color: #808030;">.</span>join<span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">[</span><span style="color: #008c00;">1</span><span style="color: #808030;">:</span><span style="color: #44aadd;">-</span>l<span style="color: #808030;">]</span><span style="color: #808030;">)</span>
            <span style="color: maroon; font-weight: bold;">for</span> l <span style="color: maroon; font-weight: bold;">in</span> <span style="color: #400000;">range</span><span style="color: #808030;">(</span><span style="color: #44aadd;">-</span><span style="color: #400000;">len</span><span style="color: #808030;">(</span>path<span style="color: #808030;">.</span>split<span style="color: #808030;">(</span><span style="color: #0000e6;">'/'</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #44aadd;">+</span><span style="color: #008c00;">1</span><span style="color: #808030;">,</span> <span style="color: #008c00;">1</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">if</span> l<span style="color: #808030;">]</span></pre>
<pre style="background: #ffffff; color: black;"># decl3 hidden. read on ;-)</pre>
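<br />
The measurement script itself is not part of this post. A minimal sketch of how such a comparison could be run with <span style="font-family: "courier new" , "courier" , monospace;">timeit</span> follows; the <span style="font-family: "courier new" , "courier" , monospace;">benchmark</span> helper and the <span style="font-family: "courier new" , "courier" , monospace;">PATHS</span> dictionary are hypothetical names, and the repetition count is lowered so the sketch finishes quickly.<br />

```python
import timeit


def imper1(path):
    result = []
    for i in range(1, len(path.split('/'))):
        y = '/'.join(path.split('/')[:i]) or '/'
        result.append(y)
    return result


def decl1(path):
    return ['/' + '/'.join(path.split('/')[1:l])
            for l in range(len(path.split('/')))[::-1] if l]


# The three sample paths used in the post.
PATHS = {
    'short': '/lel',
    'regular': '/jakie/jest/srednie/zagniezdzenie?',
    'long': '/z/lekka/dlugasna/sciezka/co/by/pierdzielnik/mial/troche/roboty',
}


def benchmark(funcs, paths, number=10000):
    """Return {(function name, path label): seconds spent on `number` calls}."""
    return {
        (f.__name__, label): timeit.timeit(lambda: f(p), number=number)
        for f in funcs
        for label, p in paths.items()
    }


for (name, label), seconds in sorted(benchmark([imper1, decl1], PATHS).items()):
    print('{:8} {:8} {:.4f}s'.format(name, label, seconds))
```

Absolute numbers depend on the machine and the Python version, so only the relative ordering of the functions is worth comparing.<br />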
<br />
<br />
It started with imper1 and decl1. I noticed that the imperative version is faster. I tried to speed up the declarative function by replacing [::-1] with some number tricks (decl2). It helped, but not to the extent I anticipated. Then I thought about speeding up the imperative version by using lower-level constructs (imper2). Unsurprisingly, while loops and index checks were faster. Let me ignore decl3 for now and play a little with CPython bytecode.<br />
<br />
Looking at my results, not everything is obvious. decl{1,2} turned out to have decent performance with a 4-part path, which looks like a reasonable average.<br />
<br />
I disassembled decl1 and decl2 to see the difference in bytecode. The diff is shown below.<br />
<span style="font-size: xx-small;"><br /></span>
<span style="color: purple;"><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">30 CALL_FUNCTION 1 (1 positional, 0 keyword pair) | 30 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br />33 CALL_FUNCTION 1 (1 positional, 0 keyword pair) | 33 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br />36 CALL_FUNCTION 1 (1 positional, 0 keyword pair) | 36 UNARY_NEGATIVE<br />39 LOAD_CONST 0 (None) | 37 LOAD_CONST 4 (1) <br />42 LOAD_CONST 0 (None) | 40 BINARY_ADD<br />45 LOAD_CONST 5 (-1) | 41 LOAD_CONST 4 (1) <br />48 BUILD_SLICE 3 | 44 CALL_FUNCTION 2 (2 positional, 0 keyword pair)<br />51 BINARY_SUBSCR</span></span></span><br />
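<br />
A comparison like the one above can be reproduced with the standard <span style="font-family: "courier new" , "courier" , monospace;">dis</span> module. The sketch below counts opcodes per function instead of diffing the raw listing; <span style="font-family: "courier new" , "courier" , monospace;">opcode_counts</span> and <span style="font-family: "courier new" , "courier" , monospace;">opcode_diff</span> are hypothetical helpers, and the opcodes you get depend on the CPython version, so they will not match the Python 3.5 listing exactly.<br />

```python
import dis
from collections import Counter


def decl1(path):
    return ['/' + '/'.join(path.split('/')[1:l])
            for l in range(len(path.split('/')))[::-1] if l]


def decl2(path):
    return ['/' + '/'.join(path.split('/')[1:-l])
            for l in range(-len(path.split('/')) + 1, 1) if l]


def opcode_counts(func):
    """Count how often each opcode appears in the function's top-level bytecode."""
    return Counter(ins.opname for ins in dis.get_instructions(func))


def opcode_diff(f, g):
    """Return {opname: (count in f, count in g)} for opcodes whose counts differ."""
    a, b = opcode_counts(f), opcode_counts(g)
    return {op: (a[op], b[op]) for op in sorted(set(a) | set(b)) if a[op] != b[op]}


for op, (left, right) in opcode_diff(decl1, decl2).items():
    print('{:24} decl1={} decl2={}'.format(op, left, right))
```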
<br />
<br />
As we can see, [::-1] is implemented as three constant loads followed by BUILD_SLICE and BINARY_SUBSCR. I think this could be optimized if we had a special opcode like e.g. BUILD_REV_SLICE. My slightly optimized decl2 is faster because one UNARY_NEGATIVE and one BINARY_ADD cost less than the extra LOAD_CONST, BUILD_SLICE and BINARY_SUBSCR. The performance gain here is pretty obvious: no matter what, decl2 must be faster.<br />
<br />
What about decl2 vs imper1?<br />
It's more complicated, and it was a little surprising that the function with the shorter top-level bytecode (decl2) can be slower than its longer counterpart (imper1), whose bytecode is shown below.<br />
<br />
<span style="color: purple;"><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"> 3 0 BUILD_LIST 0 <br /> 3 STORE_FAST 1 (result)<br /> <br /> 4 6 SETUP_LOOP 91 (to 100) <br /> 9 LOAD_GLOBAL 0 (range)<br /> 12 LOAD_CONST 1 (1)<br /> 15 LOAD_GLOBAL 1 (len)<br /> 18 LOAD_FAST 0 (path)<br /> 21 LOAD_ATTR 2 (split) <br /> 24 LOAD_CONST 2 ('/')<br /> 27 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 30 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 33 CALL_FUNCTION 2 (2 positional, 0 keyword pair)<br /> 36 GET_ITER <br /> >> 37 FOR_ITER 59 (to 99)<br /> 40 STORE_FAST 2 (i)<br /><br /> 5 43 LOAD_CONST 2 ('/')<br /> 46 LOAD_ATTR 3 (join)<br /> 49 LOAD_FAST 0 (path)<br /> 52 LOAD_ATTR 2 (split)<br /> 55 LOAD_CONST 2 ('/')<br /> 58 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 61 LOAD_CONST 0 (None)<br /> 64 LOAD_FAST 2 (i)<br /> 67 BUILD_SLICE 2<br /> 70 BINARY_SUBSCR<br /> 71 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 74 JUMP_IF_TRUE_OR_POP 80<br /> 77 LOAD_CONST 2 ('/')<br /> >> 80 STORE_FAST 3 (y)<br /><br /> 6 83 LOAD_FAST 1 (result)<br /> 86 LOAD_ATTR 4 (append)<br /> 89 LOAD_FAST 3 (y)<br /> 92 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 95 POP_TOP<br /> 96 JUMP_ABSOLUTE 37<br /> >> 99 POP_BLOCK<br /><br /> 7 >> 100 LOAD_FAST 1 (result)<br /> 103 RETURN_VALUE</span></span></span><br />
<br />
The culprit was the LOAD_CONST in decl{1,2} that loads the list comprehension as a code object. Let's see how it looks, just for the record.<br />
<span style="color: purple;"><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span></span>
<span style="color: purple;"><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">>>> dis.dis(decl2.__code__.co_consts[1])<br /> 21 0 BUILD_LIST 0<br /> 3 LOAD_FAST 0 (.0)<br /> >> 6 FOR_ITER 51 (to 60)<br /> 9 STORE_FAST 1 (l)<br /> 12 LOAD_FAST 1 (l)<br /> 15 POP_JUMP_IF_FALSE 6<br /> 18 LOAD_CONST 0 ('/')<br /> 21 LOAD_CONST 0 ('/')<br /> 24 LOAD_ATTR 0 (join)<br /> 27 LOAD_DEREF 0 (path)<br /> 30 LOAD_ATTR 1 (split)<br /> 33 LOAD_CONST 0 ('/')<br /> 36 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 39 LOAD_CONST 1 (1)<br /> 42 LOAD_FAST 1 (l)<br /> 45 UNARY_NEGATIVE<br /> 46 BUILD_SLICE 2<br /> 49 BINARY_SUBSCR<br /> 50 CALL_FUNCTION 1 (1 positional, 0 keyword pair)<br /> 53 BINARY_ADD<br /> 54 LIST_APPEND 2<br /> 57 JUMP_ABSOLUTE 6<br /> >> 60 RETURN_VALUE</span></span></span><br />
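<br />
The <span style="font-family: "courier new" , "courier" , monospace;">co_consts[1]</span> index above is specific to that build. A slightly more general sketch, assuming only the standard <span style="font-family: "courier new" , "courier" , monospace;">dis</span> and <span style="font-family: "courier new" , "courier" , monospace;">types</span> modules, is to scan the constants for code objects; <span style="font-family: "courier new" , "courier" , monospace;">nested_code_objects</span> is a hypothetical helper name. Note that since Python 3.12 comprehensions are inlined into the enclosing function (PEP 709), so on newer interpreters the list may simply be empty.<br />

```python
import dis
import types


def decl2(path):
    return ['/' + '/'.join(path.split('/')[1:-l])
            for l in range(-len(path.split('/')) + 1, 1) if l]


def nested_code_objects(func):
    """Return the code objects stored among the function's constants."""
    return [c for c in func.__code__.co_consts if isinstance(c, types.CodeType)]


for code in nested_code_objects(decl2):
    print(code.co_name)  # name of the nested code object
    dis.dis(code)        # its disassembly
```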
<br />
So this is what list comprehensions look like when compiled to bytecode. Nice! Now the performance results make more sense. In the project I was working on, my function for getting all parent paths was called in one place and contributed perhaps less than 5% of the execution time of the whole application. It would not make sense to optimize this piece of code. But it was a delightful journey into the internals of CPython, wasn't it?<br />
<br />
Now, let's get back to decl3. What have I done to make my declarative implementation 2x faster in the average case and for the right-hand outliers? Well... I just reluctantly gave up on putting everything in one line and saved path.split('/') into a separate variable. That's it.<br />
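<br />
The exact decl3 source is not shown in this post. A minimal sketch of what the description above suggests, assuming it is simply decl2 with the split hoisted into a local variable (the name <span style="font-family: "courier new" , "courier" , monospace;">parts</span> is made up), could look like this:<br />

```python
def decl3(path):
    # Split once and reuse the result, instead of calling split() per element.
    parts = path.split('/')
    return ['/' + '/'.join(parts[1:-l])
            for l in range(-len(parts) + 1, 1) if l]


print(decl3('/a/b/c/d'))  # ['/a/b/c', '/a/b', '/a', '/']
```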
<br />
So what are the learnings?<br />
<ul>
<li>the declarative method turned out to be faster than the hand-crafted imperative one employing low-level constructs.<br />Why? Good question! Maybe because the bytecode generator knows how to produce optimized code when it encounters a list comprehension? But I have written no CPython code, so this is only my speculation.</li>
<li>trying to put everything in one line can hurt - in the described case the repeated split() call was the major performance drag</li>
</ul>
reddit-related updates:<br />
Dunj3 outpaced me ;) - his implementation, which is better both w.r.t. "declarativeness" and performance: <br />
<pre><code>list(itertools.accumulate(path.split('/'), curry(os.sep.join)))
</code></pre>
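The snippet above relies on a <span style="font-family: "courier new" , "courier" , monospace;">curry</span> helper that is not defined in this post. A runnable variant of the same idea, assuming a plain lambda in its place and a literal '/' instead of os.sep so that it behaves the same on every platform, might look like the sketch below. <span style="font-family: "courier new" , "courier" , monospace;">itertools.accumulate</span> yields the prefixes shortest first and includes the full path, so the sketch trims and reverses the result to match the list from the beginning of the post.<br />

```python
import itertools


def parents_accumulate(path):
    parts = path.split('/')
    # accumulate yields '', '/a', '/a/b', ... and finally the full path
    prefixes = itertools.accumulate(parts, lambda acc, part: '/'.join((acc, part)))
    # drop the full path, map the empty root prefix to '/', nearest parent first
    return [p or '/' for p in list(prefixes)[:-1]][::-1]


print(parents_accumulate('/a/b/c/d'))  # ['/a/b/c', '/a/b', '/a', '/']
```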
<br />
<span style="color: #999999;"><span style="font-size: xx-small;">syntax highlighting done with https://tohtml.com/python/ </span></span><br />
<br />
<b>Logstash + filebeat: Invalid Frame Type, received: 1</b> (2017-01-03)<br />
A post for googlers who stumble on the same issue - it seems that "overconfiguration" is not a great idea for Filebeat and Logstash.<br />
<br />I decided to explicitly set ssl.verification_mode to none in my Filebeat config, and then I got the following Filebeat and Logstash errors:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">filebeat_1 | 2017/01/03 07:43:49.136717 single.go:140: ERR Connecting error publishing events (retrying): EOF<br />filebeat_1 | 2017/01/03 07:43:50.152824 single.go:140: ERR Connecting error publishing events (retrying): EOF<br />filebeat_1 | 2017/01/03 07:43:52.157279 single.go:140: ERR Connecting error publishing events (retrying): EOF<br />filebeat_1 | 2017/01/03 07:43:56.173144 single.go:140: ERR Connecting error publishing events (retrying): EOF <br />filebeat_1 | 2017/01/03 07:44:04.189167 single.go:140: ERR Connecting error publishing events (retrying): EOF</span><br />
<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">logstash_1 | 07:42:35.714 [Api Webserver] INFO logstash.agent - Successfully started Logstash API endpoint {:port=>9600} <br />logstash_1 | 07:43:49.135 [nioEventLoopGroup-4-1] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 3 <br />logstash_1 | 07:43:49.139 [nioEventLoopGroup-4-1] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 1 <br />logstash_1 | 07:43:50.150 [nioEventLoopGroup-4-2] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 3 <br />logstash_1 | 07:43:50.154 [nioEventLoopGroup-4-2] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 1 <br />logstash_1 | 07:43:52.156 [nioEventLoopGroup-4-3] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 3 <br />logstash_1 | 07:43:52.157 [nioEventLoopGroup-4-3] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 1 <br />logstash_1 | 07:43:56.170 [nioEventLoopGroup-4-4] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 3 <br />logstash_1 | 07:43:56.175 [nioEventLoopGroup-4-4] ERROR org.logstash.beats.BeatsHandler - Exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 1</span><br />
<br />
It seems it's better to stay quiet with Filebeat :) Hopefully this helps you resolve your issue.<br />
<br />
<b>std::queue's big default footprint in assembly code</b> (2016-12-13)<br />
Recently I've been quite busy and now I'm kind of finding my way back into the C++ world. A friend of mine told me about the <a href="http://www.includeos.org/" target="_blank">IncludeOS</a> project and I thought it might be a pretty good exercise to put my hands on my keyboard and help with this wonderful project.<br />
<br />
To be honest, the learning curve is quite steep (or I'm getting too old to learn so fast) and I'm still distracted by a lot of other things, so no big deliverables so far... but just by watching the discussion on <a href="https://gitter.im/hioa-cs/IncludeOS" target="_blank">Gitter</a> and combining it with what I know, I spotted a probably obvious, but a little bit surprising, thing about <span style="font-family: "courier new" , "courier" , monospace;">std::queue</span>.<br />
<br />
std::queue is not a container. Wait, what?, you ask. It's a container adapter. It has no storage implementation of its own. Instead, it takes another container, uses it as the underlying storage and just provides a convenient interface for the end user. By the way, it isn't the only one; <span style="font-family: "courier new" , "courier" , monospace;">std::stack</span> and <span style="font-family: "courier new" , "courier" , monospace;">std::priority_queue</span> are container adapters too.<br />
<br />
One of the dimensions in which C++ shines is the number of options for customizing things. We can customize things like memory allocators. In container adapters we can customize the underlying container if we decide that the one chosen by the library writers isn't a good match for us.<br />
<br />
By default, perhaps because std::queue requires fast access at the beginning and the end, its underlying container is <span style="font-family: "courier new" , "courier" , monospace;">std::deque</span>. <span style="font-family: "courier new" , "courier" , monospace;">std::deque</span> provides O(1) complexity for pushing/popping at both ends. Perfect match, isn't it?<br />
<br />
Well, yes, if you care about performance at the cost of increased binary size. As it turns out, it is enough to simply change <span style="font-family: "courier new" , "courier" , monospace;">std::deque</span> to <span style="font-family: "courier new" , "courier" , monospace;">std::vector</span>:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;">std::queue<<span style="color: #38761d;">int</span>> qd; </span></b><br />
<b><span style="font-family: "courier new" , "courier" , monospace;">std::queue<<span style="color: #38761d;">int</span>, std::vector<<span style="color: #38761d;">int</span>>> qv;</span></b><br />
<br />
Generated assembly code for x86-64 clang 3.8 (-O3 -std=c++14) is 502 and 144 respectively.<br />
<br />
I know that in most contexts binary size is a secondary consideration, but I still find it interesting that the difference is so big. In other words, there must be a lot going on under the bonnet of <span style="font-family: "courier new" , "courier" , monospace;">std::deque</span>. I don't recommend changing deque to vector in production: std::vector has no pop_front(), so std::queue::pop() will not even compile with it, and removing elements from the front of a vector by other means can seriously damage your performance.<br />
<br />
You can play around with the code here: <a href="https://godbolt.org/g/XaLhS7">https://godbolt.org/g/XaLhS7</a> (code based on <a href="https://github.com/Voultapher" target="_blank">Voultapher</a>'s example).<br />
<br />
<b>elasticdiff</b> (2016-09-07)<br />
Recently I needed to compare two Elasticsearch indices. To be honest, I was pretty sure that I'd find something on the Internet. It was a surprise that no such tool existed. I thought this was a good time to pay off a portion of my debt to open source :)<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiT6GoU597vHSwEgVYVQEcZ25ikK9Br6PZi2WCddmQQ92rGUq0ci3phkI_DRPWenoKq-Tz_Adbcb56zPzu3YX5qe6XWSbRoVYwHPXsjtXnohLoTWy1hzcHJhwsu47cPK_7l9SihLYQsQK4/s1600/elasticdiff.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiT6GoU597vHSwEgVYVQEcZ25ikK9Br6PZi2WCddmQQ92rGUq0ci3phkI_DRPWenoKq-Tz_Adbcb56zPzu3YX5qe6XWSbRoVYwHPXsjtXnohLoTWy1hzcHJhwsu47cPK_7l9SihLYQsQK4/s200/elasticdiff.png" width="175" /></a></div>
<b>Enter ElasticDiff</b><br />
<br />
<a href="https://github.com/szborows/elasticdiff" target="_blank">elasticdiff</a> is a simple project hosted on GitHub which allows you to compare two ES indices without pain. It is in early development, so don't expect much, but for simple indices it works pretty well.<br />
<br />
Usage is quite trivial:<br />
<span style="background-color: rgba(0, 0, 0, 0.0392157); color: #333333; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 13.6px; line-height: 20.4px;">python3 elasticdiff.py -t docType -i key http://1.2.3.4/index1 http://2.3.4.5/index1</span><br />
<br />
<br />
<br />
The output was designed to imitate the diff command:<br />
<pre style="background-color: #f7f7f7; border-radius: 3px; box-sizing: border-box; color: #333333; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 13.6px; font-stretch: normal; line-height: 1.45; overflow: auto; padding: 16px; word-wrap: normal;"><code style="background: transparent; border-radius: 3px; border: 0px; box-sizing: border-box; display: inline; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 13.6px; line-height: inherit; margin: 0px; overflow: visible; padding: 0px; word-break: normal; word-wrap: normal;">only in left: a5
only in left: a1
only in right: a4
only in right: a2
entries for key a3 differ
---
+++
@@ -1,4 +1,4 @@
{
"key": "a3",
- "value": "DEF"
+ "value": "XYZ"
}
Summary:
2 entries only in left index
2 entries only in right index
1 common entries
0 of them are the same
1 of them differ</code></pre>
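<br />
To illustrate the kind of comparison that produces such output, here is a minimal sketch. It is not the actual elasticdiff code: it assumes the documents of both indices have already been fetched into plain dictionaries keyed by the chosen key field, and <span style="font-family: "courier new" , "courier" , monospace;">diff_indices</span> is a hypothetical function name.<br />

```python
import difflib
import json


def diff_indices(left, right):
    """Compare two {key: document} mappings and return diff-like report lines."""
    lines = []
    for key in sorted(left.keys() - right.keys()):
        lines.append('only in left: {}'.format(key))
    for key in sorted(right.keys() - left.keys()):
        lines.append('only in right: {}'.format(key))
    for key in sorted(left.keys() & right.keys()):
        if left[key] == right[key]:
            continue
        lines.append('entries for key {} differ'.format(key))
        # Serialize deterministically so the line diff is stable.
        a = json.dumps(left[key], indent=2, sort_keys=True).splitlines()
        b = json.dumps(right[key], indent=2, sort_keys=True).splitlines()
        lines.extend(difflib.unified_diff(a, b, lineterm=''))
    return lines


left = {'a1': {'key': 'a1'}, 'a3': {'key': 'a3', 'value': 'DEF'}}
right = {'a2': {'key': 'a2'}, 'a3': {'key': 'a3', 'value': 'XYZ'}}
print('\n'.join(diff_indices(left, right)))
```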
<br />
Hopefully someone will find this tool useful.<br />
<br />
More information is available on the <a href="https://github.com/szborows/elasticdiff" target="_blank">GitHub</a> page. I'm planning to release this on PyPI when it gets more mature. I will update this post when that happens.<br />
<br />
<b>Elasticsearch cluster as Eucalyptus on RHEL 7.2, using Ansible</b> (2016-07-25)<br />
Just thought that this can be useful for someone else.<br />
<br />
https://github.com/szborows/es-cluster-rhel7-ansible-eucalyptus<br />
<br />
<b>me @ ACCU 2016 - what every C++ programmer should know about modern compilers</b> (2016-05-08)<br />
In Poland we have this proverb, "pierwsze koty za płoty"*. Basically we say it when we do something for the first time in our lives. After finding a translation, I was quite amused - "the first pancake is always spoiled". Recently I had my debut abroad at the ACCU 2016 conference. You decide whether it was a spoiled pancake or not :)<br />
<br />
My short 15-minute talk was about modern compilers. I start with a basic introduction, which may be a bit surprising, then I move on to the C++ standard perspective and undefined behavior, then there's a part about optimizations, and finally I cover the ecosystem that has grown around compilers in recent years (sanitizers, third-party tools based on compilers, etc.). Enjoy!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe width="320" height="266" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/nfDTTxH5DsI/0.jpg" src="https://www.youtube.com/embed/nfDTTxH5DsI?feature=player_embedded" frameborder="0" allowfullscreen></iframe></div>
<br />
Slides: <a href="http://www.slideshare.net/szborows/what-every-c-programmer-should-know-about-modern-compilers-w-comments-accu-2016" target="_blank">with comments</a> / <a href="http://www.slideshare.net/szborows/what-every-c-programmer-should-know-about-modern-compilers-wo-comments-accu-2016" target="_blank">without comments</a><br />
<br />
<b>GCC 6.1 released</b> [<a href="https://gcc.gnu.org/ml/gcc/2016-04/msg00244.html" target="_blank">link</a>]<br />
If GCC 6.1 had been released two weeks earlier, I would definitely have included a slide about how verbose and helpful compilers are nowadays.<br />
Several years ago they were unhelpful and very quiet: even when something was going terribly wrong (and the programmer would only discover it at run time), they happily compiled the code, assuming that the programmer knew what she was doing.<br />
And now... what a change. Compilers have become so friendly that GCC 6.1 can even tell you that you have unmerged git content in the file!<br />
<br />
* a literal translation doesn't work :) - "first cats over the fences".
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-67076530704352787682016-03-17T17:59:00.000-07:002016-03-17T18:02:47.140-07:00
Really simple JSON server
As some of you may already know, there's the <a href="https://github.com/typicode/json-server" target="_blank">json-server</a> project, which aims to provide an easy way to create fake servers returning JSON responses over HTTP. It is a great project, but if you need something simpler that does not necessarily follow all of the REST rules, you're out of luck.<br />
<br />
For instance, if you send PUT/POST requests to json-server, the underlying database will change.<br />
<br />
<a href="https://github.com/szborows/really-simple-json-server" target="_blank">really-simple-json-server</a>, on the other hand, is an effort to create a really simple JSON server. To show you how simple it is, let's have a look at some example routes (example.json).<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">{</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: magenta;">"/config"</span>: {<span style="color: magenta;">"avatar_upstream_server"</span>: <span style="color: magenta;">"172.17.42.42"</span>},</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: magenta;">"/u/1/friends"</span>: [<span style="color: magenta;">"slavko"</span>, <span style="color: magenta;">"asia"</span>, <span style="color: magenta;">"agata"</span>, <span style="color: magenta;">"zbyszek"</span>, <span style="color: magenta;">"lucyna"</span>],</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: magenta;">"/u/2/friends"</span>: [<span style="color: magenta;">"adam"</span>, <span style="color: magenta;">"grzegorz"</span>],</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: magenta;">"/u/3/friends"</span>: [],</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: magenta;">"/u/4/friends"</span>: [<span style="color: magenta;">"slavko"</span>]</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">}</span><br />
<br />
Then, assuming all dependencies (see below) are installed, all that's left is to start the server:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ ./server.py --port 1234 example.json</span><br />
<br />
And we can start querying the server!<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ curl http://localhost:1234/config</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">{<span style="color: magenta;">"avatar_upstream_server"</span>: <span style="color: magenta;">"172.17.42.42"</span>}</span><br />
<br />
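The whole idea fits in a lookup table from URL path to canned response. The sketch below is not the project's code (which uses aiohttp); it only illustrates the concept with the standard library, and the helper names <code>respond</code>, <code>make_handler</code> and <code>serve</code>, as well as the 404 behaviour for unknown paths, are my assumptions.<br />

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_routes(text):
    """Parse the routes file: a JSON object mapping URL paths to response bodies."""
    return json.loads(text)


def respond(routes, path):
    """Return (status, body) for a GET on `path`; unknown paths get a 404 (assumed)."""
    if path in routes:
        return 200, json.dumps(routes[path])
    return 404, json.dumps({"error": "not found"})


def make_handler(routes):
    """Build a request handler class that serves the given route table."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = respond(routes, self.path)
            payload = body.encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
    return Handler


def serve(routes, port):
    """Blocking helper; call it explicitly, e.g. serve(routes, 1234)."""
    HTTPServer(("localhost", port), make_handler(routes)).serve_forever()


# Two of the routes from example.json above.
routes = load_routes(
    '{"/config": {"avatar_upstream_server": "172.17.42.42"}, "/u/3/friends": []}'
)
print(respond(routes, "/config"))
```

Nothing is ever written back to the route table, which is exactly the difference from json-server mentioned earlier.<br />
<br />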
The project uses Python 3.5.1 along with the aiohttp package. It ships with a Dockerfile, so it's pretty easy to start hacking.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ docker build -t szborows/python351_aiohttp .</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">$ docker run -it -v $PWD:/app:ro szborows/python351_aiohttp /bin/bash -c <span style="color: magenta;">"/app/server.py --port 1234 /app/example.json"</span></span>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com1tag:blogger.com,1999:blog-7030930638413070841.post-42890360117282753032016-03-16T17:34:00.000-07:002016-03-16T17:34:12.760-07:00
ElasticSearch AWS cloud plugin problem connecting to Riak S3 endpoint without wildcard certs
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiC77SFHezF3GvJ8Q6wwQbIJi__UEdEZiqXmvveCMrp7Wii9vIoxKPwrtBSn7E_hN8nGfJ6IclxKKkYtC6glGNm8Bt8YZ1yXWSaGa4EQT-kbuB5EacHwNpVJ2sLasbXXNIzyBag45Ho7Ys/s1600/riakes_blog.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="127" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiC77SFHezF3GvJ8Q6wwQbIJi__UEdEZiqXmvveCMrp7Wii9vIoxKPwrtBSn7E_hN8nGfJ6IclxKKkYtC6glGNm8Bt8YZ1yXWSaGa4EQT-kbuB5EacHwNpVJ2sLasbXXNIzyBag45Ho7Ys/s320/riakes_blog.png" width="320" /></a></div>
The other day at work we were trying to use our enterprise installation of Riak S3 (an open-source, S3-compatible object store) to store backups of an ElasticSearch instance. ElasticSearch is a mature project, so it wasn't a surprise that there's already a <a href="https://github.com/elastic/elasticsearch-cloud-aws" target="_blank">plugin</a> for storing snapshots over the S3 protocol. So we prepared our request and expected it to work out of the box.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">curl -XPUT <span style="color: magenta;">'http://es.address.here:9200/_snapshot/s3_dev_backup?verify=false'</span> -d <span style="color: magenta;">'{</span></span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "type": "s3",</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "settings": {</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "access_key": "*****-***-**********",</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "secret_key": "****************************************",</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "bucket": "elastic",</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> "endpoint": "s3_cloud_front.address.here"</span><br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: magenta;">}'</span> </span><br />
<br />
As usual in corporate reality, it didn't work. Instead, we got the following exception.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">Error injecting constructor, com.amazonaws.AmazonClientException: Unable to execute HTTP request: hostname in certificate didn't match: &lt;<span style="color: magenta;">elastic.s3_cloud_front.address.here</span>&gt; != &lt;<span style="color: magenta;">s3_cloud_front.address.here</span>&gt; OR &lt;<span style="color: magenta;">s3_cloud_front.address.here</span>&gt;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> at org.elasticsearch.repositories.s3.S3Repository.&lt;init&gt;(Unknown Source)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> while locating org.elasticsearch.repositories.s3.S3Repository</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> while locating org.elasticsearch.repositories.Repository</span><br />
<br />
The S3 client addresses the bucket as a subdomain of the endpoint (bucket name first, as the error shows), but the endpoint certificate (a self-signed one, btw) wasn't prepared in such a way that both e.g. <span style="color: magenta; font-family: "courier new" , "courier" , monospace;">endpoint.address</span> and <span style="color: magenta; font-family: "courier new" , "courier" , monospace;">subdomain.endpoint.address</span> would match. In other words, it wasn't a wildcard certificate bound to the endpoint domain name.<br />
<br />
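To see why the check fails, here is a minimal sketch of certificate name matching. It is a simplification I wrote for illustration, not httpclient's actual verifier: it assumes a wildcard may only appear as the entire leftmost label and covers exactly one label, and the function name <code>matches</code> is made up.<br />

```python
def matches(cert_name, hostname):
    """Check a hostname against one certificate name (simplified wildcard rules)."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard stands for exactly one label, so the label counts must agree.
    if len(cert_labels) != len(host_labels):
        return False
    # The leftmost label matches literally or through a lone "*".
    if cert_labels[0] != "*" and cert_labels[0] != host_labels[0]:
        return False
    # All remaining labels must match literally.
    return cert_labels[1:] == host_labels[1:]


requested = "elastic.s3_cloud_front.address.here"
print(matches("s3_cloud_front.address.here", requested))
print(matches("*.s3_cloud_front.address.here", requested))
```

The first call corresponds to our certificate and fails; the second shows that a wildcard certificate for the endpoint domain would have accepted the bucket subdomain.<br />
<br />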
We decided to debug with StackOverflow (it's a common technique nowadays, isn't it :D?). After trying numerous solutions, like <a href="http://stackoverflow.com/questions/4663147/is-there-a-java-setting-for-disabling-certificate-validation" target="_blank">disabling the certificate check using Java flags</a> or even using so-called Java agents to <a href="http://stackoverflow.com/questions/6031258/java-ssl-how-to-disable-hostname-verification" target="_blank">hijack the default hostname verifier</a>, we ran out of ideas for solving the problem elegantly. And here comes our ultimate hack: hijacking the <a href="https://github.com/apache/httpclient" target="_blank">Apache httpclient</a> library.<br />
<br />
The idea behind the hack is simple: recompile the httpclient library with one small modification, a premature return statement at the very top of the hostname verification function. The function where the change should be made is as follows.<br />
<br />
<span style="color: magenta; font-family: "courier new" , "courier" , monospace;">org.apache.http.conn.ssl.SSLSocketFactory.verifyHostname(SSLSocketFactory.java:561)</span><br />
<br />
One thing to remember is that the return statement must be wrapped in some silly if statement (e.g. <span style="font-family: "courier new" , "courier" , monospace;">if (1 == 1)</span>), so the compiler won't complain about the unreachable code below it. It's funny that the Java compiler doesn't see through such a trivial condition, but in this particular scenario it was a feature rather than a bug. If the <span style="font-family: "courier new" , "courier" , monospace;">verifyHostname</span> method wasn't throwing an exception then the modification would be even simpler - just a return statement.<br />
<br />
The change is trivial, so I'm not including it here. The last step is to replace the httpclient jar in the AWS cloud plugin with the patched one, and we're done. No more complaints about SSL hostnames.
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-6423349804801287012016-03-08T12:40:00.001-08:002016-03-08T13:26:12.258-08:00
Mini REST+JSON benchmark: Python 3.5.1 vs Node.js vs C++
Some time ago at Nokia I voluntarily developed a search engine tailored for internal resources (Windows shares, intranet sites, LDAP directories, etc.) - NSearch. Since then a few people have helped me improve it, and it has become the unofficial "search that simply works". However, as you can imagine, this was purely a side project, so I didn't pay much attention to the quality of the code (I'd have loved to, but cruel time didn't permit :/). As a result, a lot of technical debt accumulated over the months.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnOXAcjTt914ZFuldBakMsQTbjZ7wym1ZJOLn0wMCOE7ztLJPGT8iqkNqK7O_l9qrqTbvnOXm1MM5P5tlMYYMjuf30XJk9OX8gnG6DZIQ5aKFvvPXTLkhcUe2wTAwGOTLG7j1C6nWf870/s1600/blog_bench1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnOXAcjTt914ZFuldBakMsQTbjZ7wym1ZJOLn0wMCOE7ztLJPGT8iqkNqK7O_l9qrqTbvnOXm1MM5P5tlMYYMjuf30XJk9OX8gnG6DZIQ5aKFvvPXTLkhcUe2wTAwGOTLG7j1C6nWf870/s400/blog_bench1.png" width="400" /></a></div>
<br />
Recently we concluded that enough is enough. We agreed that the backend of the service would be first in line. Because all of it was about to be rewritten, we thought it was a perfect time to evaluate other technologies. Currently it uses Python 2.7 + Django + Gunicorn. We are considering either Node.js with Express 4, Python 3 with aiohttp, or C++. Maybe some other language would be an even better match? However, we don't program in any other languages on a daily basis...<br />
<br />
In this post I'd like to show you the results of my very simple performance evaluation of these three technologies, along with some findings.<br />
<br />
<b>Technical facts</b><br />
<br />
<ul>
<li>everything was tested on a machine with an Intel E5-2680 v3 CPU and 192 GB of RAM, running RHEL 7.1,</li>
<li>applications were run inside Docker containers, so there might be some overhead introduced by libcontainer, libnetwork, etc.</li>
<li>ab was used for benchmarking, with 1M requests at different concurrency settings</li>
<li>max timeout was set to 1 second</li>
</ul>
<b>Attention:</b> I'm not an expert in performance testing, so it's possible that I made some mistakes. The source code of all the benchmarked programs is available <span style="background-color: white;"><a href="https://github.com/szborows/nsearch-backend-simple-benchmarks" target="_blank">here</a></span>. I recommend having a brief look at it. If you find anything wrong or suspicious that might have influenced my results, please let me know. Thanks.<br />
<br />
I prepared two JSON objects for the two benchmarks that follow. The first was a simple {"hello": "world!"}; the second was extracted directly from NSearch and consisted of about 10k characters, but I can't include it here. For C++ I used the <span style="background-color: white;"><a href="https://github.com/Corvusoft/restbed" target="_blank">Restbed</a></span> framework and the <span style="background-color: white;"><a href="https://github.com/open-source-parsers/jsoncpp" target="_blank">jsoncpp</a></span> library, just to have something with normal URL path support (otherwise the results wouldn't be reliable at all).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4P5cDyVbg0843VKPdrtD7wMs1c3l_oV1bOC7LxLAMhyphenhyphenHr_WV007qwkKttx6O9HQPYsmux2OgckPC4iZtiBD7D6N6ErXETFitY7bVQXGnBIqAFjbw182IOQFGI7j7ureC-MZVxKGnWvL0/s1600/results-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4P5cDyVbg0843VKPdrtD7wMs1c3l_oV1bOC7LxLAMhyphenhyphenHr_WV007qwkKttx6O9HQPYsmux2OgckPC4iZtiBD7D6N6ErXETFitY7bVQXGnBIqAFjbw182IOQFGI7j7ureC-MZVxKGnWvL0/s400/results-1.png" width="400" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFbTs7tWcocI578TXWSVZ7XK07r4giIZkywtDThO9K8dQjYRH1Darehw8DT91hegcTTI-NoamerCrRpc3LvPKCzfcRfHyHm2m6umEQsEy5hgEtOEdbmGuszVcDZk8ASUt_LaB3Y8dNOQg/s1600/results-3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFbTs7tWcocI578TXWSVZ7XK07r4giIZkywtDThO9K8dQjYRH1Darehw8DT91hegcTTI-NoamerCrRpc3LvPKCzfcRfHyHm2m6umEQsEy5hgEtOEdbmGuszVcDZk8ASUt_LaB3Y8dNOQg/s400/results-3.png" width="400" /></a></div>
<br />
<br />
<b>First benchmark - conclusions:</b><br />
<br />
<ul>
<li>all solutions are asynchronous and event-based, and they all use event loops; otherwise 512 concurrent users could not be served with the max timeout set to 1 second</li>
<li>starting from 192 concurrent users, the number of requests per second starts to decrease slightly</li>
<li>Python is more than two times slower than Node.js</li>
<li>C++ is more than two times faster than Node.js</li>
</ul>
Quite interesting, isn't it? I was hoping that Python 3.5.1, with its language-level async support and its asyncio module, would be faster. I was also expecting C++ to be about 30% faster, not more than two times faster. Once again I must admit that intuition is deceptive.<br />
<br />
<b>Second benchmark - conclusions:</b><br />
(I excluded C++ because filling such a large JSON object in C++ would have been inconvenient)<br />
<br />
<ul>
<li>the gap between Python and Node.js is much smaller when a bigger JSON object is involved</li>
<li>apparently starting to handle a request in Python is slow, at least compared to Node.js</li>
<li>replacing json.dumps with ujson.dumps increases Python performance by about 5%</li>
<li>Node.js performance drops drastically when the bigger JSON object is used - from over 5k requests per second to about 1700!</li>
<li>Python's drop is not that drastic - from about 1700 to 1200 requests per second. It means that once the handling of a request is under way, Python does not slow down much.</li>
</ul>
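A bit of arithmetic makes the last two points easier to see. This is only a back-of-the-envelope sketch: it assumes that each server is limited by a single event-loop thread, so that 1/throughput approximates the time spent per request, and it uses the rounded figures from the list above ("over 5k" is taken as 5000).<br />

```python
def ms_per_request(requests_per_second):
    """Approximate time spent on one request, in milliseconds."""
    return 1000.0 / requests_per_second


# Rounded throughput figures (requests per second) from the conclusions above.
small_json = {"node": 5000, "python": 1700}
large_json = {"node": 1700, "python": 1200}

extra_ms = {}
for name in ("node", "python"):
    before = ms_per_request(small_json[name])
    after = ms_per_request(large_json[name])
    # Additional time per request attributable to the bigger JSON payload.
    extra_ms[name] = after - before
    print("%s: %.2f ms -> %.2f ms per request (+%.2f ms)"
          % (name, before, after, extra_ms[name]))
```

Under these assumptions the bigger payload costs Node.js roughly 0.39 ms per request and Python roughly 0.25 ms, while Python's fixed cost of about 0.59 ms per request is there even for the tiny payload.<br />
<br />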
<br />
<b>Why is Python so slow and Node.js so much faster?</b><br />
Python is an interpreted language. This is the main reason why it's so slow. Why isn't that the case with Node.js? Because Node.js uses V8, a JavaScript engine with a built-in JIT (just-in-time) compiler, which can speed up the execution of a program by an order of magnitude.<br />
<br />
<b>Can Python be faster? Yes!</b><br />
There's also a Python interpreter with a built-in JIT compiler - <a href="http://pypy.org/" target="_blank">PyPy</a>. Unfortunately, it doesn't support Python 3.5.1 yet.<br />
<br />
<br />
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com15tag:blogger.com,1999:blog-7030930638413070841.post-90444567217843389422016-03-07T14:41:00.003-08:002016-03-07T14:44:54.768-08:00
CMake disservice: project command resets CMAKE_EXE_LINKER_FLAGS
<b>tl;dr at the bottom</b><br />
<b>CMake version: 3.3.2 </b><br />
<br />
I think everyone can agree that the world of programming is full of frustrating and irritating things. Some are less annoying, some more, and they pop up basically at random, wasting our time. Personally, I think the most depressing thing one can encounter is a weird side effect that is implicit and silent. I was hit by such a side effect today at work and would like to write about it, and (possibly) help future Googlers.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4-G6Il2toxkhETrFG61qHR4FdMwKzWPZIw_KTFtE720u7de30h_Vpsdte5p1WmDhmlw-jae_ySCanhMmlrs_JeNWkufdUugTH3ZrYL1MDTL9wK2WsRdrWyIYt1dB1p_u5TZMtkjj20lA/s1600/10284339485_e2fb304b99.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4-G6Il2toxkhETrFG61qHR4FdMwKzWPZIw_KTFtE720u7de30h_Vpsdte5p1WmDhmlw-jae_ySCanhMmlrs_JeNWkufdUugTH3ZrYL1MDTL9wK2WsRdrWyIYt1dB1p_u5TZMtkjj20lA/s320/10284339485_e2fb304b99.jpg" width="320" /></a></div>
<br />
So let me get straight to the topic. Recently the platform software used in one of our medium-size C++ projects changed and started to require an additional shared library. Our project stopped building because this new dependency wasn't listed among the link libraries. We thought the fix would be as simple as appending one item to the list, but we were proven wrong. It turned out that this new shared library was itself linked against several other shared objects, and this complicated things a little bit.<br />
<br />
In principle the linker could be satisfied with just the dependency itself, but this is not what happens (at least with GNU ld). The linker tries to minimize the chance that some symbols will be unresolved at run time, so it checks whether all of the symbols are in place, even those coming from the dependent shared objects! This effectively means that the linker goes through all of the unresolved symbols in every shared object and checks them, regardless of whether that's needed or not. I think this is a nice feature, but there's one caveat: the linker doesn't actually know where to look for the sub-dependent shared objects. This must be specified explicitly.<br />
<br />
And here comes the unintuitive thing - <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath-link</span>. Let me start with <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath</span>, though. When the program is built and then run, the run-time linker (a.k.a. the loader) looks for all the libraries linked to it and maps them into the process memory space. There is a specific order in which the loader looks for the libraries: it honors <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath</span> first, then it considers the <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">LD_LIBRARY_PATH</span> environment variable, and finally the standard system locations. <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath-link</span> is similar to <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath</span>, but... it applies at link time, not at run time (this is the unintuitive part, isn't it?). The linker uses <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">rpath-link</span> to look for sub-dependent libraries (Fig. 1).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY8i5YRGqbc3YWpSNZVksYHJNxBHTNxHlKA6KEp8xSzsYpsZ-9JdADbZ3C6e0s65OGscazPR8nkYC3yIPDUtUqMBAx2t56JrSdJX76HJbnMbrafxlNcjRJPMjWQpPpH78tDQjyTnAk_LM/s1600/cmake_rpath_link.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="131" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY8i5YRGqbc3YWpSNZVksYHJNxBHTNxHlKA6KEp8xSzsYpsZ-9JdADbZ3C6e0s65OGscazPR8nkYC3yIPDUtUqMBAx2t56JrSdJX76HJbnMbrafxlNcjRJPMjWQpPpH78tDQjyTnAk_LM/s400/cmake_rpath_link.png" width="400" /></a></div>
<br />
<div style="text-align: center;">
<i>Figure 1. Flags for project dependencies and dependencies' dependencies (sub-dependencies)</i></div>
<br />
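The run-time lookup order described above can be modelled in a few lines. This is a deliberately simplified sketch that follows the order given in this post (rpath, then LD_LIBRARY_PATH, then system directories); the real loader also takes DT_RUNPATH and the ld.so cache into account. The function name <code>find_library</code> and its parameters are hypothetical.<br />

```python
import os


def find_library(name, rpath=(), ld_library_path=(), system_dirs=("/lib", "/usr/lib")):
    """Return (origin, full_path) of the first directory containing `name`, or None."""
    search_order = (
        ("rpath", rpath),
        ("LD_LIBRARY_PATH", ld_library_path),
        ("system", system_dirs),
    )
    for origin, directories in search_order:
        for directory in directories:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return origin, candidate
    return None


# Example call with made-up directories; the result depends on your file system.
print(find_library("libfoo.so", rpath=["/opt/lib64"], ld_library_path=["/tmp/libs"]))
```

At link time, rpath-link plays the role of the first entry in that list, but only for the linker's search for sub-dependencies; it leaves no trace in the resulting binary.<br />
<br />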
Okay, but what does all this have to do with CMake?<br />
<br />
In CMake, if you want to provide linker flags, you need to use the <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMAKE_*_LINKER_FLAGS</span> family of variables. This works perfectly fine unless you need to provide a custom toolchain and sysroot (e.g. when cross-compiling). The traditional way to provide a custom toolchain with CMake is to pass <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">-DCMAKE_TOOLCHAIN_FILE</span> to the <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">cmake</span> command. CMake will then process the toolchain file before <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMakeLists.txt</span>, which lets you change compilers, sysroots, etc. It's worth mentioning here that the toolchain file is processed before the project command.<br />
<br />
As it turned out, this works flawlessly for variables like <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMAKE_CXX_FLAGS</span>, but not for the <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMAKE_*_LINKER_FLAGS</span> variables. It took us some time to discover that... the call to the project command actually resets the value of the <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMAKE_*_LINKER_FLAGS</span> variables. What a pesky side effect! After some digging I found the culprit in the file <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">Modules/CMakeCommonLanguageInclude.cmake</span>:<br />
<br />
<span style="color: blue; font-family: "courier new" , "courier" , monospace;"># executable linker flags</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: #38761d;">set</span> (<span style="color: #cc0000;">CMAKE_EXE_LINKER_FLAGS</span> <span style="color: magenta;">"${CMAKE_EXE_LINKER_FLAGS_INIT}"</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> <span style="color: #cc0000;">CACHE</span> <span style="color: #cc0000;">STRING</span> <span style="color: magenta;">"Flags used by the linker."</span>) </span><br />
<br />
As you can imagine, <span style="color: #cc0000;">CMAKE_EXE_LINKER_FLAGS_INIT</span> is not set to anything reasonable. The result is that CMake doesn't care what you had in <span style="color: #cc0000;">CMAKE_*_LINKER_FLAGS</span>. It simply drops whatever was there and replaces it with an empty string. In one word - disservice.<br />
<br />
This is not the end of the story, though. CMake had one more surprise for us, to our misfortune. As some of you know, there's a useful <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">variable_watch</span> command available in recent versions of CMake. It lets you watch a variable for all reads and writes. At the beginning we were sure that something within our build system was changing our <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">CMAKE_EXE_LINKER_FLAGS</span>, so we used <span style="color: #cc0000; font-family: "courier new" , "courier" , monospace;">variable_watch</span> to figure out what was going on. However, it didn't report any modification. In the end it turned out that it does not signal anything that happens under the bonnet (e.g. it ignores what the project command does to the observed variable). This is a joke!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8vbaYufQLv0O8rKuS6TCAqTfLssrXImHtzNtNK_EnraCwkMXXPUA_nyej5o6SM5H9L85ZCHC7JG5C8eQgJjdXtHOuf6E3-cXEGdtJRVf1MG7wNfKKhf7eRL1Pejeq6ssf373mr_-5Blc/s1600/cctv-fail.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="335" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8vbaYufQLv0O8rKuS6TCAqTfLssrXImHtzNtNK_EnraCwkMXXPUA_nyej5o6SM5H9L85ZCHC7JG5C8eQgJjdXtHOuf6E3-cXEGdtJRVf1MG7wNfKKhf7eRL1Pejeq6ssf373mr_-5Blc/s400/cctv-fail.jpg" width="400" /></a></div>
<br />
Building a system like CMake is definitely not an easy task, especially if you target lots of users and support lots of scenarios, ranging from compiling a simple one-file project to cross-compiling a huge project with a huge number of dependencies. In our case several things contributed to the final effect, but I wouldn't say that custom linker flags in a toolchain file are a corner case. I think this should definitely be fixed some day.<br />
<br />
Or maybe this is a feature and I'm the dumb one?<br />
<br />
<b>tl;dr:</b><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: #38761d;">cmake_minimum_required</span>(VERSION 3.0)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: #38761d;">set</span>(<span style="color: #cc0000;">CMAKE_EXE_LINKER_FLAGS</span> <span style="color: magenta;">"-Wl,-rpath-link=/opt/lib64"</span>) </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: #38761d;">project</span>(GreatProject)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="color: blue;"># tada! CMAKE_EXE_LINKER_FLAGS is now empty.</span> </span>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-56880857173922799262015-12-31T13:52:00.000-08:002015-12-31T13:52:23.708-08:00
You know you're no-life when you do things like this during new year's eve :)
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlBpspxnjb8kK90qOoad8_gbelQMFeKxxXqhPhviKNtEIvVyaXpfqvya3zKqalhohhTzUC2-9OUwkJKDQvbL5YNJqkfNrRgu5WFrcsaVZtO6Ghc9aGZpIEVSAr0ICS5oR65K5e8wOHqQQ/s1600/happy_new_year_2016.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlBpspxnjb8kK90qOoad8_gbelQMFeKxxXqhPhviKNtEIvVyaXpfqvya3zKqalhohhTzUC2-9OUwkJKDQvbL5YNJqkfNrRgu5WFrcsaVZtO6Ghc9aGZpIEVSAr0ICS5oR65K5e8wOHqQQ/s640/happy_new_year_2016.png" width="640" /></a></div>
<br />
Happy 0x7e0 folks! :D
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-47207415467524318412015-12-09T12:10:00.001-08:002015-12-10T00:49:29.612-08:00
Obvious assumptions can result in bugs in software: Flask in embedded world
So the other day our team at Nokia was asked to implement a web user interface for an embedded device. Usually I write in C++, but I was excited to take this task because we were able to put Python in place. Of course it's also possible to write web applications purely in C++ (e.g. with <a href="http://leaningtech.com/cheerp/" target="_blank">Cheerp</a> - I encourage you to at least read about this project), but only a thin back-end was required, so going with C++ didn't seem to be a good choice. After cross-compiling Python to ARM (we expected this to be common knowledge on the Internet, but again there were few resources) it was time to pick a framework. We decided to use <a href="http://flask.pocoo.org/" target="_blank">Flask</a> - a small cousin of <a href="https://www.djangoproject.com/" target="_blank">Django</a>, because<br />
<br />
<ul>
<li>it's smaller than Django and we were already going to eat a lot of disk space with Python itself</li>
<li>it seemed like a perfect candidate for the task, because it looked so simple</li>
</ul>
<br />
I had my own goal as well :) I didn't know Flask, so there was something new to learn.<br />
<br />
Fast forward to the point where I was about to deploy my application on the target device (after a long struggle to get everything cross-compiled). Everything was set up, so I eagerly clicked the button in the front-end (react.js-based stuff) to prove that the back-end was working as well. As you can imagine, it didn't work.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieGHSYe_q8HKltoc9JAI3PV3rihM9nGfiPvm5uPjBZO3d616xV7se5y1C3qwoYLQ7lE2cyNKy8mgGcIVdbObUFVN5DncryG-A5UFNknmISHHHRI9IhZdaZ24nySGhXvIASWYbtOP1e-dE/s1600/fn63697_01.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="218" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieGHSYe_q8HKltoc9JAI3PV3rihM9nGfiPvm5uPjBZO3d616xV7se5y1C3qwoYLQ7lE2cyNKy8mgGcIVdbObUFVN5DncryG-A5UFNknmISHHHRI9IhZdaZ24nySGhXvIASWYbtOP1e-dE/s320/fn63697_01.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">http://thenextweb.com/wp-content/blogs.dir/1/files/2015/09/fn63697_01.jpg</td></tr>
</tbody></table>
<div style="text-align: left;">
Below is what I saw in the logs. Some paths were masked, just in case. After looking at this I thought to myself: what the... It was working on my x86 machine!</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjALexNyo1rlofbF-RFi6zJy7Nh4SjxXhIvjZTbmsP2oIRnKjHXgfY9Iioao8iiwRktTYCVNJKSnzgL10M5dFMd8ZmpFr0uV6KDhy02MkRJVXuGfG4T_jqzKg0Nxg1gy9Ztp7b0i1h6EEE/s1600/works-on-my-machine-300x196.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjALexNyo1rlofbF-RFi6zJy7Nh4SjxXhIvjZTbmsP2oIRnKjHXgfY9Iioao8iiwRktTYCVNJKSnzgL10M5dFMd8ZmpFr0uV6KDhy02MkRJVXuGfG4T_jqzKg0Nxg1gy9Ztp7b0i1h6EEE/s1600/works-on-my-machine-300x196.jpg" /></a></div>
<br />
In the rest of this post I'd like to describe the reason for this failure, which I found somewhat funny.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">192.168.255.*** - - [01/Jan/2004 02:00:22] "POST /************************* HTTP/1.1" 500 -</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Traceback (most recent call last):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return self.wsgi_app(environ, start_response)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> response = self.make_response(self.handle_exception(e))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> reraise(exc_type, exc_value, tb)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> response = self.full_dispatch_request()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1479, in full_dispatch_request</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> response = self.process_response(response)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1693, in process_response</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> self.save_session(ctx.session, response)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 837, in save_session</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return self.session_interface.save_session(self, session, response)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/sessions.py", line 326, in save_session</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val = self.get_signing_serializer(app).dumps(dict(session))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 569, in dumps</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> rv = self.make_signer(salt).sign(payload)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 412, in sign</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> timestamp = base64_encode(int_to_bytes(self.get_timestamp()))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 220, in int_to_bytes</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> assert num >= 0</span><br />
<br />
OK, so normally I read tracebacks bottom-up. After looking at the last item in the traceback I instantly felt that something really weird was happening. The name of the file didn't make things any less weird. The items above the last one don't tell much more, so I had to dive into the source code to look for other hints.<br />
<br />
Function <span style="font-family: "courier new" , "courier" , monospace;">int_to_bytes</span>, at the very beginning, asserts that the input number is non-negative. Quite a normal thing to do, right? So I started a debugger to check what value it actually got. I had a hunch that it would be -1 or something like that. Dark clouds were already gathering in my mind at the time. I was even afraid that the hardware itself didn't support some feature. That's not rare in the embedded world, after all.<br />
<br />
With the debugger I was able to retrieve the value passed in. It turned out to be <span style="font-family: "courier new" , "courier" , monospace;">-220917729</span>. Immediately I realized that I was facing an issue with time, and that the Python interpreter itself was not to blame. So I jumped into <span style="font-family: "courier new" , "courier" , monospace;">get_timestamp</span> to see how the timestamp is created.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">def get_timestamp(self):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> """Returns the current timestamp. This implementation returns the</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> seconds since 1/1/2011. The function must return an integer.</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> """</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return int(time.time() - EPOCH)</span><br />
<br />
I double-checked that <span style="font-family: "courier new" , "courier" , monospace;">time.time()</span> returns a positive number. With the call to <span style="font-family: "courier new" , "courier" , monospace;">time()</span> ruled out, it was obvious that subtracting <span style="font-family: "courier new" , "courier" , monospace;">EPOCH</span> was making the result negative. Then I only needed to put two facts together: <span style="font-family: "courier new" , "courier" , monospace;">EPOCH</span> in this function is set to <span style="font-family: "courier new" , "courier" , monospace;">1/1/2011</span> and the target device starts with its clock set to <span style="font-family: "courier new" , "courier" , monospace;">1/1/2004</span>. This yielded a negative timestamp and, as a result, a spectacular failure.<br />
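The arithmetic is easy to reproduce off-device. The snippet below is only a minimal sketch, not the library code: it assumes that itsdangerous 0.24 defines <code>EPOCH</code> as 1 Jan 2011 00:00:00 UTC, and <code>timestamp_at</code> is a hypothetical helper that mirrors <code>get_timestamp()</code> for an arbitrary clock value.<br />

```python
import calendar

# Assumption: itsdangerous 0.24 uses 1 Jan 2011 00:00:00 UTC as its EPOCH.
EPOCH = calendar.timegm((2011, 1, 1, 0, 0, 0))


def timestamp_at(now):
    """Hypothetical helper: what get_timestamp() would return at Unix time `now`."""
    return int(now - EPOCH)


# A device whose clock starts at 1 Jan 2004 00:00:00 UTC.
boot_time = calendar.timegm((2004, 1, 1, 0, 0, 0))
print(timestamp_at(boot_time))
```

Under that assumption the result at boot is -220924800, so the value seen in the debugger (-220917729) would correspond to a clock roughly two hours past the device's start date.<br />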
<br />
So I find this one funny, because apparently the authors didn't expect that the system time could be set to 2004-something :) Perhaps they assumed that it's not possible to run with a date earlier than the day when the library was published :)<br />
<br />
Nevertheless, I was expecting errors to show up in my setup. You don't deploy web applications on embedded devices that often. Frankly speaking, I encountered other funny problems that day as well. One of them was with nginx, and if time permits I'll write a post about it too. Stay tuned ;]<br />
<br />
<div style="text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1tvFCOTQwb5hyphenhyphenBOBRjqC27aJ_Ai8VHhyphenhyphen65Xm67-wUTUFNkW2ZjELbAgwZ8nMiImdRf7Fi8K3r828TCv8baZ8-DPIYQXWwDvIu5c9mPA08-Gf1Vn9p4cuJKdaOxoOVbEgUGWDTNe2XUDk/s1600/Expect-the-unexpected.jpg" imageanchor="1"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1tvFCOTQwb5hyphenhyphenBOBRjqC27aJ_Ai8VHhyphenhyphen65Xm67-wUTUFNkW2ZjELbAgwZ8nMiImdRf7Fi8K3r828TCv8baZ8-DPIYQXWwDvIu5c9mPA08-Gf1Vn9p4cuJKdaOxoOVbEgUGWDTNe2XUDk/s320/Expect-the-unexpected.jpg" width="246" /></a></div>
<br />
Note: I intentionally skipped the description of Flask's internal organization. In short, Flask uses the <a href="https://pypi.python.org/pypi/itsdangerous" target="_blank">ItsDangerous</a> package to provide signed, client-side session storage. You can read more about it <a href="http://pythonhosted.org/itsdangerous/" target="_blank">here</a>.<br />
<div>
<br /></div>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-55674339599006454752015-12-09T10:05:00.001-08:002015-12-09T10:05:07.000-08:00Video: Check if number is a palindrome in PythonA few months ago I published a post about checking whether a given number is a palindrome in Python using as few characters as possible. Out of the blue the team behind <a href="https://www.webucator.com/" target="_blank">Webucator</a> made a great video based on my blog post. To be honest I didn't anticipate such a thing and I'm very happy that it happened.<br />
<br />
The video itself is excellent so I encourage you to watch it.<br />
<br />
On their site you can find more videos. I believe it's also worth mentioning that they offer pretty comprehensive <a href="https://www.webucator.com/programming/python.cfm" target="_blank">Python classes</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe width="320" height="266" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/j1Xu79YNcEQ/0.jpg" src="https://www.youtube.com/embed/j1Xu79YNcEQ?feature=player_embedded" frameborder="0" allowfullscreen></iframe></div>
<br />bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-4865693984977036502015-07-29T04:59:00.000-07:002015-07-29T08:01:09.791-07:00Python: palindrome number test challengeRecently, in one of the <a href="http://check.io/">check.io</a> missions I was playing with a task to find the first palindromic prime number (palprime) greater than a given argument. The challenge was to make the code as small as possible. To check whether a number is a palprime we need to check whether it is both a prime and a palindromic number. In this post I'd like to focus on checking whether a number is palindromic.<br />
<br />
Spoiler alert: think twice before reading this post if you're going to play the game.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaqlHMWitWjUh58N8Uo1zY6eR1qG9vIHG2NPEbLzTcto42LV2pcx-7jWK4eHDVoW5AItu2J8yPr6xKsx4yfb_PLxHqqYpJs0N5SrpJoErXAY42OF0ffd4dKIZLsIrWGP09NxjEAosHVoE/s1600/tumblr_nf8x7keJkF1rkt29io1_500.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaqlHMWitWjUh58N8Uo1zY6eR1qG9vIHG2NPEbLzTcto42LV2pcx-7jWK4eHDVoW5AItu2J8yPr6xKsx4yfb_PLxHqqYpJs0N5SrpJoErXAY42OF0ffd4dKIZLsIrWGP09NxjEAosHVoE/s320/tumblr_nf8x7keJkF1rkt29io1_500.png" width="320" /></a></div>
<div style="text-align: center;">
<span style="font-size: xx-small;">(source: http://40.media.tumblr.com/42ae198800aa76ad3c06878fd93a2d82/tumblr_nf8x7keJkF1rkt29io1_500.png)</span></div>
<br />
I believe most Python devs out there would quickly come up with the intuitive solution presented below.<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;"><span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)==<span class="Identifier" style="color: #006f6f; font-size: 1em;">reversed</span>(<span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n))</pre>
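The approach here is pretty straightforward - instead of doing arithmetic (like <a href="http://stackoverflow.com/questions/199184/how-do-i-check-if-a-number-is-a-palindrome" target="_blank">here</a>) we can simply convert the number to a string and then compare it with its reversed version. The solution consists of 24 characters. (Strictly speaking, <span style="font-family: Courier New, Courier, monospace;">reversed()</span> returns an iterator rather than a string, so as written the comparison is always false; a working version has to wrap it in <span style="font-family: Courier New, Courier, monospace;">''.join(...)</span>, which makes it even longer.) Can we make it smaller?<br />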
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFNCUo4p0pWto5Yt47Wj-g_zkJiOlxRLUKWjphfvXTAWPc612w_ArjUi5HTyq5jd5vi82zxv7enHHPCk-ZMn-xTeTUUt_EEVAhHo2b6QtFFPWZlF_1-M8JgIMAsOx6rSvKmu6CwL1oofI/s1600/cardboard_phone_2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFNCUo4p0pWto5Yt47Wj-g_zkJiOlxRLUKWjphfvXTAWPc612w_ArjUi5HTyq5jd5vi82zxv7enHHPCk-ZMn-xTeTUUt_EEVAhHo2b6QtFFPWZlF_1-M8JgIMAsOx6rSvKmu6CwL1oofI/s320/cardboard_phone_2.jpg" width="320" /></a></div>
<div style="text-align: center;">
<span style="font-size: xx-small;">(source: http://gemssty.com/wp-content/uploads/2013/03/cardboard_phone_2.jpg)</span></div>
<br />
It will be obvious to seasoned Python programmers to use fabulous Python slices, as shown below.<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;"><span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)==<span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)[::-<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>]</pre>
With this simple transformation we saved 4 characters - that's nice. How does it work? Python slices allow specifying start, stop, and step. If we omit start and stop and provide -1 as the step, slicing works in a magical way: it goes through the sequence from right to left. As a side note, slices in Python behave a little weirdly with explicit bounds - e.g. <span style="font-family: Courier New, Courier, monospace;">"abcd"[0:len("abcd"):-1]</span> gives <span style="font-family: Courier New, Courier, monospace;">""</span> while <span style="font-family: Courier New, Courier, monospace;">"abcd"[len("abcd"):0:-1]</span> renders <span style="font-family: Courier New, Courier, monospace;">"dcb"</span>, because the stop index is exclusive. Replacing <span style="font-family: Courier New, Courier, monospace;">[::-1]</span> with explicit start and stop parameters would therefore require us to append the first character to the result: <span style="font-family: Courier New, Courier, monospace;">"abcd"[len("abcd"):0:-1]+"a"</span>.<br />
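The slice behaviour described above can be checked with a few lines. This is just an illustrative sketch; the variable names are mine.<br />

```python
s = "abcd"

# Omitting start and stop with a negative step walks the whole string backwards.
full_reverse = s[::-1]

# With explicit bounds the stop index is exclusive, so index 0 is never reached.
partial = s[len(s):0:-1]

# Going from 0 up to len(s) with a negative step selects nothing.
empty = s[0:len(s):-1]

print(full_reverse, partial, repr(empty))
```

Appending <code>s[0]</code> to <code>partial</code> gives the same string as <code>full_reverse</code>, which is why the explicit form ends up longer.<br />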
<br />
So, going back to the topic, are we stuck with this solution? If we can invert the logic (that won't always be the case), then there's another way, which I discovered by trial and error. It required me to adapt the rest of the code, but not a single character was added. Instead, I saved one more.<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;">n-<span class="Identifier" style="color: #006f6f; font-size: 1em;">int</span>(<span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)[::-<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>])</pre>
In essence, the idea is that if the number is palindromic then subtracting its reversed version from the original should result in 0. Here we convert to a string just to reverse the number, and then convert back to an integer.<br />
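To make the inverted logic explicit, here is a small sketch for non-negative integers. The function names are hypothetical and only wrap the golfed expression for readability.<br />

```python
def palindrome_distance(n):
    """Return n minus its digit reversal; 0 means n reads the same both ways."""
    return n - int(str(n)[::-1])


def is_palindrome(n):
    # The golfed expression is falsy (0) for palindromes, hence the 'not'.
    return not palindrome_distance(n)


print(palindrome_distance(121), palindrome_distance(123))
print(is_palindrome(121), is_palindrome(123))
```

Because the expression is zero for palindromes, the calling code has to treat a falsy result as "yes", which is the adaptation mentioned above.<br />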
<br />
So I ended up with 19 characters, wondering what else I could do. I literally ran out of ideas, so I thought of googling for the operators available in Python. Over the years I have managed to convince myself that even if you think you have mastered your language of choice, you are wrong (C++ helped me to shape this attitude with stuff like template unions, secret shared_ptr c-tors etc. ;)).<br />
<br />
In the haystack of results I found a needle. <a href="https://docs.python.org/2.7/reference/expressions.html#string-conversions" target="_blank">It turns out</a> that in Python 2.x it's possible to convert a number (even if it's held in a variable) to a string using backticks (a.k.a. backquotes, reverse quotes). Well... what can I say. I never expected such a thing to exist, especially in Python.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_uS7nODLDmsdXoF9UxjcILlE1QL1sP3MIlvpmAKKVGhGbIQ0e8k7-o1x61ZTTS9Kbd8JFhA2xmXc4EZJTTRlfgLxEJzhPZdkdzmzgs8kUSraRlNoBv7GjJC6_MIIWG812vgpSrhLjHxM/s1600/what.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_uS7nODLDmsdXoF9UxjcILlE1QL1sP3MIlvpmAKKVGhGbIQ0e8k7-o1x61ZTTS9Kbd8JFhA2xmXc4EZJTTRlfgLxEJzhPZdkdzmzgs8kUSraRlNoBv7GjJC6_MIIWG812vgpSrhLjHxM/s320/what.jpg" width="320" /></a></div>
<br />
Fortunately it does the job, so I was able to use this feature and cut the length of the test down to 14. Awesome!<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;">`n`==`n`[::-<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>]</pre>
The above solution is the shortest I was able to come up with. I wouldn't be surprised if there were shorter ones, though. If you know how to make it shorter, please let me know in the comments.<br />
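Backticks were removed in Python 3, so the 14-character version only runs on Python 2.x. Since backticks evaluate to the same string as <code>repr()</code>, a Python 3 equivalent can be sketched as follows (the function name is mine, and it is of course longer than 14 characters).<br />

```python
def is_palindrome(n):
    # repr(n) is what the Python 2 backtick expression `n` evaluated to.
    return repr(n) == repr(n)[::-1]


print(is_palindrome(1221), is_palindrome(1223))
```

One caveat: in Python 2 the <code>repr()</code> of a <code>long</code> carries a trailing <code>L</code>, so the backtick trick only holds for values that fit in a plain <code>int</code>.<br />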
<br />
Besides making this check so short, I was curious about the performance differences between the conversion and reversal methods. I decided to give it a go on my machine (i5-3320M 2.6GHz).<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">python -m timeit 'n=2**30; str(n)'</span><br />
<span style="font-family: Courier New, Courier, monospace;">python -m timeit 'n=2**30; repr(n)'</span><br />
<span style="font-family: Courier New, Courier, monospace;">python -m timeit 'n=2**30; `n`'</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie_5o7CSqSxtQFao0Dxf4FJu6slr_4ICEBtEO6N056dWwjT-OYvCEj60lZMAuYpSTxqq9PCc9UL4R63iBb748mevG9w8-ZN-4CB00PqFuUFRev4uBY7w-7_-mmMw7yHldVce02vQ0FleA/s1600/figure_1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie_5o7CSqSxtQFao0Dxf4FJu6slr_4ICEBtEO6N056dWwjT-OYvCEj60lZMAuYpSTxqq9PCc9UL4R63iBb748mevG9w8-ZN-4CB00PqFuUFRev4uBY7w-7_-mmMw7yHldVce02vQ0FleA/s320/figure_1.png" width="320" /></a></div>
<br />
<span style="font-family: Courier New, Courier, monospace;">python -m timeit 's="qwertyuiopasfghjklzxcvbnm"; rs="".join(reversed(s))'</span><br />
<span style="font-family: Courier New, Courier, monospace;">python -m timeit 's="qwertyuiopasfghjklzxcvbnm"; rs=s[::-1]'</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSrAshQTVqEkCoy16khTvA5acH-2_kzQ1Jmu_tdYkPTV3NgmfilQX2vTDaJ3I_e3Y-t9XotrSX2Jk8Fj7__En0zudNBwNxmzw5pqOttHf4V9Qqc8_zIR-6TksFf0jluDs0lKxBTq1z75A/s1600/figure_2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSrAshQTVqEkCoy16khTvA5acH-2_kzQ1Jmu_tdYkPTV3NgmfilQX2vTDaJ3I_e3Y-t9XotrSX2Jk8Fj7__En0zudNBwNxmzw5pqOttHf4V9Qqc8_zIR-6TksFf0jluDs0lKxBTq1z75A/s320/figure_2.png" width="320" /></a></div>
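The same comparison can be scripted with the standard <code>timeit</code> module instead of the command line. This is only a rough sketch for Python 3 (so no backticks); the <code>bench</code> helper and the iteration count are my own choices, and absolute numbers will differ from the plots above.<br />

```python
import timeit

n = 2 ** 30
s = "qwertyuiopasfghjklzxcvbnm"


def bench(stmt, number=100000):
    """Hypothetical helper: seconds taken by `number` runs of stmt."""
    return timeit.timeit(stmt, globals={"n": n, "s": s}, number=number)


results = {
    "str(n)": bench("str(n)"),
    "repr(n)": bench("repr(n)"),
    "''.join(reversed(s))": bench("''.join(reversed(s))"),
    "s[::-1]": bench("s[::-1]"),
}

for stmt, seconds in results.items():
    print("%-22s %.4f s" % (stmt, seconds))
```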
My conclusions are:<br />
<ul>
<li>there's a very slight (almost unnoticeable) difference between the three methods of converting an integer to a string</li>
<li>for some reason Python 3.4 is slower than Python 2.7</li>
<li>PyPy cruelly outperforms CPython, as usual. If you care about performance you should use PyPy.</li>
<li>there's almost no difference between reversed() and [::-1] when it comes to speed</li>
</ul>
<div>
The fact that there is no performance difference between reversed() and [::-1] surprised me a bit. Looking at the bytecode, I can only assume that in my particular case two different sets of operations yielded the same performance.</div>
<div>
<br /></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">reversed [::-1]</span><br />
<span style="font-family: Courier New, Courier, monospace;">-------------------------------------------------------------------<br /> 0 LOAD_CONST 0 ('qwer...') </span><span style="font-family: 'Courier New', Courier, monospace;">0 LOAD_CONST 0 ('qwer...')</span></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> 3 STORE_NAME 0 (s) </span><span style="font-family: 'Courier New', Courier, monospace;">3 STORE_NAME 0 (s)</span></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;"> 6 LOAD_CONST 1 ('') </span><span style="font-family: 'Courier New', Courier, monospace;">6 LOAD_NAME 0 (s)</span></b></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;"> 9 LOAD_ATTR 1 (join) </span><span style="font-family: 'Courier New', Courier, monospace;">9 LOAD_CONST 1 (None)</span></b></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;">12 LOAD_NAME 2 (reversed) </span><span style="font-family: 'Courier New', Courier, monospace;">12 LOAD_CONST 1 (None)</span></b></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;">15 LOAD_NAME 0 (s) </span><span style="font-family: 'Courier New', Courier, monospace;">15 LOAD_CONST 3 (-1)</span></b></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;">18 CALL_FUNCTION 1 (1 pos, 0 kw) </span><span style="font-family: 'Courier New', Courier, monospace;">18 BUILD_SLICE 3</span></b></div>
<div>
<b><span style="font-family: Courier New, Courier, monospace;">21 CALL_FUNCTION 1 (1 pos, 0 kw) </span><span style="font-family: 'Courier New', Courier, monospace;">21 BINARY_SUBSCR</span></b></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">24 STORE_NAME 3 (rs) </span><span style="font-family: 'Courier New', Courier, monospace;">22 STORE_NAME 1 (rs)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">27 LOAD_CONST 2 (None) </span><span style="font-family: 'Courier New', Courier, monospace;">25 LOAD_CONST 1 (None)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">30 RETURN_VALUE </span><span style="font-family: 'Courier New', Courier, monospace;">28 RETURN_VALUE</span></div>
</div>
<div>
<br /></div>
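A listing like the one above can be regenerated with the standard <code>dis</code> module. The sketch below wraps both variants in hypothetical functions; opcode names and offsets depend on the interpreter version, so the output will not match the Python 2.7 listing exactly.<br />

```python
import dis


def with_reversed(s):
    return "".join(reversed(s))


def with_slice(s):
    return s[::-1]


# Print the bytecode of both variants for a side-by-side comparison.
dis.dis(with_reversed)
dis.dis(with_slice)
```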
<div>
In the case of conversion from integer to string I anticipated that <span style="font-family: Courier New, Courier, monospace;">str</span> and <span style="font-family: Courier New, Courier, monospace;">repr</span> would yield similar results. What I didn't know was how backticks would perform. Looking at the bytecode, I can say that the only difference is that backticks don't call any function, but use the <span style="font-family: Courier New, Courier, monospace;">UNARY_CONVERT</span> op directly.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">str ``</span><br />
<span style="font-family: Courier New, Courier, monospace;">----------------------------------------------------------------------------<br /> 0 LOAD_CONST 0 (1234567890) </span><span style="font-family: 'Courier New', Courier, monospace;">0 LOAD_CONST 0 (1234567890)</span><br />
<span style="font-family: Courier New, Courier, monospace;"> 3 STORE_NAME 0 (n) </span><span style="font-family: 'Courier New', Courier, monospace;">3 STORE_NAME 0 (n)</span><br />
<b><span style="font-family: Courier New, Courier, monospace;"> 6 LOAD_NAME 1 (str) </span><span style="font-family: 'Courier New', Courier, monospace;">6 LOAD_NAME 0 (n)</span></b><br />
<b><span style="font-family: Courier New, Courier, monospace;"> 9 LOAD_NAME 0 (n) </span><span style="font-family: 'Courier New', Courier, monospace;">9 UNARY_CONVERT</span></b><br />
<span style="font-family: Courier New, Courier, monospace;"><b>12 CALL_FUNCTION 1</b> </span><span style="font-family: 'Courier New', Courier, monospace;">10 STORE_NAME 1 (s)</span><br />
<span style="font-family: Courier New, Courier, monospace;">15 STORE_NAME 2 (s) </span><span style="font-family: 'Courier New', Courier, monospace;">13 LOAD_CONST 1 (None)</span><br />
<span style="font-family: Courier New, Courier, monospace;">18 LOAD_CONST 1 (None) </span><span style="font-family: 'Courier New', Courier, monospace;">16 RETURN_VALUE</span><br />
<span style="font-family: Courier New, Courier, monospace;">21 RETURN_VALUE </span><br />
<br />
<br /></div>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com2tag:blogger.com,1999:blog-7030930638413070841.post-32478835719242898402015-07-21T13:23:00.000-07:002015-07-21T13:23:32.305-07:00CheckIo.org - A game for software engineers :)<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi11pkjEomk8BC7FkgFB6Pblb3iFqHFLku3qgj8w1ITyj1a4RP5pDHI3vuk7jROjLirUdOPR-asBYJBRSDMXVz45hCSzv3HuyMsd1pVpKRKknqYu0q_nRTgM2kzAFUv6Au7nmejx2do8Ho/s1600/check.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="175" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi11pkjEomk8BC7FkgFB6Pblb3iFqHFLku3qgj8w1ITyj1a4RP5pDHI3vuk7jROjLirUdOPR-asBYJBRSDMXVz45hCSzv3HuyMsd1pVpKRKknqYu0q_nRTgM2kzAFUv6Au7nmejx2do8Ho/s320/check.png" width="320" /></a></div>
<br />
I've just found an excellent way of spending an evening as a programmer - playing a game designed for software engineers :)<br />
<br />
Since I discovered the site by accident, I think it's worth mentioning here so maybe you'll get interested too. In the game you, in essence, solve various tasks to unlock more "levels".<br />
<br />
The only drawback I can see is that only Python is supported (I'm a Python enthusiast, but I know a lot of people who are more into Go, JavaScript etc.).<br />
<br />
To give you a quick taste without spoilers, let me post my solution for the very first (and very basic) task - FizzBuzz. This is the simplest one and can be implemented quickly, but I'm the kind of person who likes to over-engineer.<br />
<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;"><span class="Statement" style="color: brown; font-size: 1em; font-weight: bold;">return</span> <span class="Constant" style="color: deeppink; font-size: 1em;">''</span>.join([[<span class="Constant" style="color: deeppink; font-size: 1em;">'Fizz '</span>, <span class="Constant" style="color: deeppink; font-size: 1em;">''</span>][(n%<span class="Constant" style="color: deeppink; font-size: 1em;">3</span>)!=<span class="Constant" style="color: deeppink; font-size: 1em;">0</span>], [<span class="Constant" style="color: deeppink; font-size: 1em;">'Buzz'</span>, <span class="Constant" style="color: deeppink; font-size: 1em;">''</span>][(n%<span class="Constant" style="color: deeppink; font-size: 1em;">5</span>)!=<span class="Constant" style="color: deeppink; font-size: 1em;">0</span>]]).strip() <span class="Statement" style="color: brown; font-size: 1em; font-weight: bold;">or</span> <span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)</pre>
<br />
You think this is weird? Check out the solution from <a href="http://www.checkio.org/mission/fizz-buzz/publications/nickie/python-3/string-arithmetic/?ordering=most_voted" target="_blank">nickie</a> :)<br />
<br />
<pre id="vimCodeElement" style="background-color: seashell; font-size: 13px; white-space: pre-wrap;"><span class="Statement" style="color: brown; font-size: 1em; font-weight: bold;">return</span> (<span class="Constant" style="color: deeppink; font-size: 1em;">"Fizz "</span>*(<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>-n%<span class="Constant" style="color: deeppink; font-size: 1em;">3</span>)+<span class="Constant" style="color: deeppink; font-size: 1em;">"Buzz "</span>*(<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>-n%<span class="Constant" style="color: deeppink; font-size: 1em;">5</span>))[:-<span class="Constant" style="color: deeppink; font-size: 1em;">1</span>]<span class="Statement" style="color: brown; font-size: 1em; font-weight: bold;">or</span> <span class="Identifier" style="color: #006f6f; font-size: 1em;">str</span>(n)</pre>
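Both one-liners are easier to compare when wrapped in functions. The sketch below does just that; the function names are mine, and the bodies are the two expressions shown above.<br />

```python
def fizz_buzz_indexing(n):
    # A boolean indexes a two-element list: False -> 0, True -> 1.
    parts = [["Fizz ", ""][n % 3 != 0], ["Buzz", ""][n % 5 != 0]]
    return "".join(parts).strip() or str(n)


def fizz_buzz_multiplying(n):
    # 1 - n % k is 1 when k divides n and zero or negative otherwise;
    # multiplying a string by a non-positive number yields "".
    return ("Fizz " * (1 - n % 3) + "Buzz " * (1 - n % 5))[:-1] or str(n)


for n in (3, 5, 7, 15):
    print(n, fizz_buzz_indexing(n), fizz_buzz_multiplying(n))
```

The second version works because string repetition by zero or a negative count gives an empty string, and the final slice drops the trailing space.<br />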
<br />
Happy hacking!!bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-17738191557054722015-05-01T01:32:00.001-07:002015-05-01T01:32:53.877-07:00KnowCamp Wrocław / More functional C++14Video coverage of the first KnowCamp, organized by Nokia and held in Wrocław.<br />
My presentation "More functional C++14" is available below (link <a href="http://www.slideshare.net/szborows/more-functional-c14">here</a>). BaSz's presentation "C++14 prostszy niż kiedykolwiek" (Eng. "C++14 simpler than ever") is <a href="http://baszerr.eu/lib/exe/fetch.php/docs/cpp14_prostszy_niz_kiedykolwiek.pdf">here</a>, but only in Polish.<br />
<br />
<iframe allowfullscreen="" frameborder="0" height="315" src="https://www.youtube.com/embed/NBPd8Uy8TH8" width="560"></iframe>
<br />
<div style="text-align: center; width: 100%;">
<iframe allowfullscreen="" frameborder="0" height="355" marginheight="0" marginwidth="0" scrolling="no" src="//www.slideshare.net/slideshow/embed_code/key/r0ABtPWj1qskyx" style="border-width: 1px; border: 1px solid #CCC; display: block; margin-bottom: 5px; max-width: 100%;" width="425"></iframe></div>
<br />
<center>
<div style="margin-bottom: 5px;">
<strong> <a href="https://www.slideshare.net/szborows/more-functional-c14" target="_blank" title="More functional C++14">More functional C++14</a> </strong> from <strong><a href="https://www.slideshare.net/szborows" target="_blank">szborows</a></strong> </div>
</center>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com0tag:blogger.com,1999:blog-7030930638413070841.post-5028837583725196382015-04-06T14:39:00.002-07:002015-05-09T06:18:17.762-07:00Real Boost.multi_index performance<div class="" style="clear: both; text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPXhb2WpEt58op7RxDsaa_ZjWPTUAuwPDOdBx7w9X_KYcHHJ1KZ_gCj0d6FJWaiBC8UWYBXtvEtRlp7iqPdC7f0hzbN_2yZEBzooK0YqVA7J84LWvrfGCdrrBFZu3R8YeaMFK_aQE-szU/s1600/download.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPXhb2WpEt58op7RxDsaa_ZjWPTUAuwPDOdBx7w9X_KYcHHJ1KZ_gCj0d6FJWaiBC8UWYBXtvEtRlp7iqPdC7f0hzbN_2yZEBzooK0YqVA7J84LWvrfGCdrrBFZu3R8YeaMFK_aQE-szU/s1600/download.jpg" width="200" /></a><br />
<b>Update 09.05.2015:</b> <a href="http://bannalia.blogspot.com/" target="_blank">Joaquín</a> pointed out a small bug in my plotting script: the x tick labels were wrong. All charts have been fixed.<br />
<br />
Several months ago I gave a presentation about the multi_index library from Boost at Nokia TechMeetUp. Essentially, I showed that it is a pretty useful library that should be used when there's a need to represent the same data with multiple "views". For instance, if we have to access the same list of items sorted by different attributes, Boost.MultiIndex seems to be a very good choice, as the documentation claims strong consistency and, just as importantly, better performance. It's also interesting that Facebook's library, <a href="https://github.com/facebook/folly">folly</a>, uses it in its TimeoutQueue class.<br />
</div>
<div class="separator" style="clear: both; text-align: left;">
In the middle of the presentation I put some performance graphs which I mindlessly copied from the documentation. Had I known that they were old, I would never have done that. Fortunately, a <a href="http://baszerr.eu/">friend</a> of mine pointed this out. These graphs do illustrate performance gains, but the measurements were made with ancient compiler versions like GCC 3.4.5.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
After the presentation I promised myself that I would investigate this a bit, but as usual time didn't permit. Until now.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
I made measurements with newer versions of GCC and Clang. Unfortunately, I have no access to Visual Studio. If you can provide me with VS results, I'd be grateful.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Here are my results for the following setup:</div>
<div class="separator" style="clear: both; text-align: left;">
CPU: i5-3320M</div>
<div class="separator" style="clear: both; text-align: left;">
RAM: 8GB</div>
<div class="separator" style="clear: both; text-align: left;">
OS: Debian 7</div>
<div class="separator" style="clear: both; text-align: left;">
Compilation flags: -O3</div>
<div class="separator" style="clear: both; text-align: left;">
Boost version: 1.57</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The charts from the documentation are shown on the right.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div style="width: 100%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7YbDjjQJG9Q-Z35b0nW6COkq_f_no_TNGdKXOl4llfGJaWHfOFV1PrYUa_INshntjdhn6BjxtPBpF7-bX5XS1EUWvW9IZvrq39BpAhN59XK8xx4aIntf0R-DCJQBVFfrGQVVopfW8R8Q/s1600/1_ordered_index.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7YbDjjQJG9Q-Z35b0nW6COkq_f_no_TNGdKXOl4llfGJaWHfOFV1PrYUa_INshntjdhn6BjxtPBpF7-bX5XS1EUWvW9IZvrq39BpAhN59XK8xx4aIntf0R-DCJQBVFfrGQVVopfW8R8Q/s320/1_ordered_index.png" width="320" /></a></div>
<div style="float: left; width: 49%;">
<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7MmMWh46CFcvrSVOYSwAsXXbSRx4ige8ptSRYL942ZTjkbq8LyoFFHpP01-g6QXltpeYlr-2UFlWKot0jhUP1i5TMGR0IgYym3ImhN3JuigMykQKqCRRwO8hvICQwTXdXe_de5gajmnA/s1600/1_ordered_index.orig.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7MmMWh46CFcvrSVOYSwAsXXbSRx4ige8ptSRYL942ZTjkbq8LyoFFHpP01-g6QXltpeYlr-2UFlWKot0jhUP1i5TMGR0IgYym3ImhN3JuigMykQKqCRRwO8hvICQwTXdXe_de5gajmnA/s320/1_ordered_index.orig.png" width="270" /></a></div>
<br style="clear: both;" /></div>
<div class="separator" style="clear: both; text-align: left;">
In the case of a single ordered index, it's visible that either the standard containers have improved or compilers have learned to optimize code using them better. Nevertheless, there is a visible trend showing that with more than 10<sup>6</sup> entries, a multi_index container could perform better than the naive approach.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div style="width: 100%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6XatoGyMWjhjYXxujTpvOQZJRQdB4NVeRZuIyEGGjhGqe-tZ_gWb93ZPKJ8e2RYDRpJOJtOQ5XIN67Hq0vYR6R8X7BrlvhCLIBtsaDP-nI4ju5H2Vz7WSz6ZNdnU97UYgP4SgTWzsUa8/s1600/1_ordered_1_sequenced.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6XatoGyMWjhjYXxujTpvOQZJRQdB4NVeRZuIyEGGjhGqe-tZ_gWb93ZPKJ8e2RYDRpJOJtOQ5XIN67Hq0vYR6R8X7BrlvhCLIBtsaDP-nI4ju5H2Vz7WSz6ZNdnU97UYgP4SgTWzsUa8/s320/1_ordered_1_sequenced.png" width="320" /></a></div>
<br />
<br />
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQkIpvWIO1EqShAlbPc1lPZYbcj8RpRB6wtb4DlyowUK4VDcN0-r-6aOQzd0i5ITKD3YYv8lLCV88-AH6m34pgPJBNNVGgsprEAk9sxKkYB8MUrV9DgnTVQt0_k0tUMw-5h3cFxcOpIu4/s1600/1_ordered_1_sequenced.orig.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQkIpvWIO1EqShAlbPc1lPZYbcj8RpRB6wtb4DlyowUK4VDcN0-r-6aOQzd0i5ITKD3YYv8lLCV88-AH6m34pgPJBNNVGgsprEAk9sxKkYB8MUrV9DgnTVQt0_k0tUMw-5h3cFxcOpIu4/s320/1_ordered_1_sequenced.orig.png" width="270" /></a></div>
<br style="clear: left;" /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
With two indices, one ordered and one sequenced, the situation changes. In the original chart from the documentation we can observe a huge improvement in favor of multi_index starting from 10<sup>5</sup> elements. My results are quite different: there is no performance gain with multi_index. For 10<sup>6</sup> elements the difference between the two diagrams is dramatic: ~130% and ~30%.</div>
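<div class="separator" style="clear: both; text-align: left;">
For comparison, one plausible shape of the naive approach for this scenario can be sketched with standard containers only: a std::list keeps the insertion order and a std::set of list iterators keeps the sorted order. This is an assumption made for illustration, not the code behind the charts, and the class name OrderedAndSequenced is hypothetical.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <set>
#include <vector>

// Hand-rolled "1 ordered + 1 sequenced" container built from two standard ones.
class OrderedAndSequenced {
    using ListIt = std::list<int>::const_iterator;

    // Orders list iterators by the values they point to.
    struct ByValue {
        bool operator()(ListIt a, ListIt b) const { return *a < *b; }
    };

public:
    OrderedAndSequenced() = default;
    // Copying would leave the set pointing into the other object's list.
    OrderedAndSequenced(const OrderedAndSequenced&) = delete;
    OrderedAndSequenced& operator=(const OrderedAndSequenced&) = delete;

    // Appends value unless an equal one is already stored; returns true on insertion.
    bool insert(int value) {
        items_.push_back(value);
        const ListIt it = std::prev(items_.cend());
        const bool inserted = sorted_.insert(it).second;
        if (!inserted) {
            items_.pop_back();  // duplicate: undo the list insertion
        }
        assert(sorted_.size() == items_.size());  // both views stay in sync
        return inserted;
    }

    std::size_t size() const { return items_.size(); }

    // Sequenced view: elements in insertion order.
    std::vector<int> in_insertion_order() const {
        return std::vector<int>(items_.begin(), items_.end());
    }

    // Ordered view: elements in ascending order.
    std::vector<int> in_sorted_order() const {
        std::vector<int> result;
        result.reserve(sorted_.size());
        for (ListIt it : sorted_) {
            result.push_back(*it);
        }
        return result;
    }

private:
    std::list<int> items_;
    std::set<ListIt, ByValue> sorted_;
};
```

<div class="separator" style="clear: both; text-align: left;">
Every element costs two node allocations here (one list node and one set node), and the two containers have to be kept in sync by hand, whereas a multi_index container stores a single node per element.</div>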
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div style="width: 100%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXfUgg2An1MUS1x7Z87crL4xVK6rewAHEnJ77bbR4hGegFn6dKB1ERIpUaFWIQYWZh0lbGUoIeLKHPzc_0fF8UbDOUEgfjP4Wphe-0H0fnXOzzjNaKvDaQLPW140Ef2yG05HJQ0oO_quY/s1600/1_sequenced_index.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXfUgg2An1MUS1x7Z87crL4xVK6rewAHEnJ77bbR4hGegFn6dKB1ERIpUaFWIQYWZh0lbGUoIeLKHPzc_0fF8UbDOUEgfjP4Wphe-0H0fnXOzzjNaKvDaQLPW140Ef2yG05HJQ0oO_quY/s320/1_sequenced_index.png" width="320" /></a></div>
</div>
<div style="float: left; width: 49%;">
<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2xMUPNEGIDXgQhTZQh7zX1JEsQN1ETr6palD3sALB4LL5ik8Udz_QHZqTZ2PYclWtCOPXKGvT1u0KTvzjMiP3nasPVetbGdlTODhaPp3iLWjp0KJDJpSPwjDzQOkCJZ1wLFab9wXhDr0/s1600/1_sequenced_index.orig.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2xMUPNEGIDXgQhTZQh7zX1JEsQN1ETr6palD3sALB4LL5ik8Udz_QHZqTZ2PYclWtCOPXKGvT1u0KTvzjMiP3nasPVetbGdlTODhaPp3iLWjp0KJDJpSPwjDzQOkCJZ1wLFab9wXhDr0/s320/1_sequenced_index.orig.png" width="270" /></a></div>
<br style="clear: left;" />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Interestingly, on my machine multi_index performed better with just one sequenced index. This surprised me, because the documentation shows the opposite. In this scenario, multi_index outperformed the naive approach.<br />
The rest of the diagrams are presented below. As you can see, they illustrate that the multi_index documentation is outdated. Sometimes there are performance gains, but not as large as those presented in the diagrams from the documentation.<br />
<br />
<center>
<div style="width: 60%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkkXKgUU49Nu9WGTfNI7MNS3u9IhiCSkwFpQJjpChcepZjUWKAYPkolP5NonZKvtGU8sL9PqwNWtiU0wr85Htw704IYPTwPtfC9EDttwEXwZmPFC5JiGW2lM8YO4Qr8_vHkr6_au8vfys/s1600/2_ordered_1_sequenced.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkkXKgUU49Nu9WGTfNI7MNS3u9IhiCSkwFpQJjpChcepZjUWKAYPkolP5NonZKvtGU8sL9PqwNWtiU0wr85Htw704IYPTwPtfC9EDttwEXwZmPFC5JiGW2lM8YO4Qr8_vHkr6_au8vfys/s200/2_ordered_1_sequenced.png" width="200" /></a></div>
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVYVzFMbPnZc_Rs3xhtt7ik9HdbYt-uiTTT2teWbixsbQULcwUjI1IAhzSkH1YIkAm1mU-GTUg2KKVwPxY0p62lVcZf6Y8OQt03j85gS4KlFb2Qb7CYTawXivUGjQ6fJGCBDnmt6vvXE0/s1600/2_ordered_1_sequenced.orig.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVYVzFMbPnZc_Rs3xhtt7ik9HdbYt-uiTTT2teWbixsbQULcwUjI1IAhzSkH1YIkAm1mU-GTUg2KKVwPxY0p62lVcZf6Y8OQt03j85gS4KlFb2Qb7CYTawXivUGjQ6fJGCBDnmt6vvXE0/s200/2_ordered_1_sequenced.orig.png" width="200" /></a></div>
<br style="clear: left;" /></div>
</center>
<br />
<center>
<div style="width: 60%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggyLxHo_8CCOgRRZXZdOFCHnLGmy7V2AIQvcgVsdaknQy2BtRGZIFmekZYdiJ1_29Rg-wL_X4FRhLxKc4efS4M_D1YiDlHaZqwzxJZXtA5BaGLVT-TP5kcwmFq4aP85_pnegsdV2hApdQ/s1600/3_ordered.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggyLxHo_8CCOgRRZXZdOFCHnLGmy7V2AIQvcgVsdaknQy2BtRGZIFmekZYdiJ1_29Rg-wL_X4FRhLxKc4efS4M_D1YiDlHaZqwzxJZXtA5BaGLVT-TP5kcwmFq4aP85_pnegsdV2hApdQ/s200/3_ordered.png" width="200" /></a></div>
<br />
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjV2s0cuq63RFPx09z5H7wfM1QfPDDNXNh8wedx3thbh-TyeWcO05oRjX4_SDlivO2NbcU2bPD79WexDt220U7CxoJJiDwDusdm9k74CRT-T34V33UInh7u3PRY5jN8sgIS_Qb9Carfcmo/s1600/3_ordered.orig.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjV2s0cuq63RFPx09z5H7wfM1QfPDDNXNh8wedx3thbh-TyeWcO05oRjX4_SDlivO2NbcU2bPD79WexDt220U7CxoJJiDwDusdm9k74CRT-T34V33UInh7u3PRY5jN8sgIS_Qb9Carfcmo/s200/3_ordered.orig.png" width="200" /></a></div>
<br style="clear: left;" /></div>
</center>
<br />
<br />
<center>
<div style="width: 60%;">
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiw5DwuXQHLJDFbSGNXPRclzOEEKjGYmJ2ygM7SdNqwC9T-ixQpvd97cGjDgWibXcCLtGODY_8P1fzjs-_zw-uTxrlsGvxak2HyExjB1otWyrwi_Grj_SJhPsohkrcQCOK-9NEQpdtiahs/s1600/2_ordered_indices.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiw5DwuXQHLJDFbSGNXPRclzOEEKjGYmJ2ygM7SdNqwC9T-ixQpvd97cGjDgWibXcCLtGODY_8P1fzjs-_zw-uTxrlsGvxak2HyExjB1otWyrwi_Grj_SJhPsohkrcQCOK-9NEQpdtiahs/s200/2_ordered_indices.png" width="200" /></a></div>
<div style="float: left; width: 49%;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk-kXd8LO5ePVcDkFHSQ63aVXSnjnRvmppEORiGAZaK2ScmA2V1WL_lINSO8njr3KYd_fexnwMBaCwOw7jYclLGHz_VtnSRlWUkobkmHMoSjkVbQLQpsnuAf7B7BXthR5-K6encShf3dU/s1600/2_ordered_indices.orig.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk-kXd8LO5ePVcDkFHSQ63aVXSnjnRvmppEORiGAZaK2ScmA2V1WL_lINSO8njr3KYd_fexnwMBaCwOw7jYclLGHz_VtnSRlWUkobkmHMoSjkVbQLQpsnuAf7B7BXthR5-K6encShf3dU/s200/2_ordered_indices.orig.png" width="200" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br style="clear: left;" /></div>
</center>
Undoubtedly this deserves deeper investigation. In particular, it should be determined why something that was supposed to perform better no longer does.<br />
Unfortunately, I currently have no time to dig into this. (Maybe in the next few days I'll have some time to check how cache-friendly the multi_index library is.) Nevertheless, everything I have shown reinforces my opinion that everything should be measured, even third-party libraries with a lot of promises in the documentation.<br />
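<br />
Measuring does not need much machinery. Below is a minimal timing sketch based on std::chrono::steady_clock; it is not the harness used for the charts above, and the names best_time_ns and fill_set are hypothetical. It runs a piece of work several times and reports the fastest run.<br />

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <set>

// Runs `work` `repetitions` times and returns the fastest run in nanoseconds.
template <typename Work>
std::int64_t best_time_ns(Work work, int repetitions) {
    assert(repetitions > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suitable for intervals
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < repetitions; ++i) {
        const auto start = Clock::now();
        work();
        const auto stop = Clock::now();
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        best = std::min<std::int64_t>(best, elapsed);
    }
    return best;
}

// Example workload (hypothetical): insert n integers into a std::set.
// Returning the size gives the caller a result to check.
std::size_t fill_set(std::size_t n) {
    std::set<std::size_t> values;
    for (std::size_t i = 0; i < n; ++i) {
        values.insert(i);
    }
    return values.size();
}
```

A call such as best_time_ns([] { fill_set(1000000); }, 10) gives a first number for one container; the same lambda with a different container gives the number to compare it with. A serious benchmark should also make sure the compiler cannot optimize the measured work away.<br />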
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMKhwZDh0tG0zSrSIVU-lWXnMa2MBFzc0jAZDjMg_KQxK5b5gM8A-QWe3ejhQzYfR_PZO_NnYMSBfLAtB9AKvPpX9ai3erGwmXnMzQMTh2oze5Y1MpM8hkGT0JHFetncn3Q0PGg7ye5Ew/s1600/broken_promises_by_herrfous1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="268" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMKhwZDh0tG0zSrSIVU-lWXnMa2MBFzc0jAZDjMg_KQxK5b5gM8A-QWe3ejhQzYfR_PZO_NnYMSBfLAtB9AKvPpX9ai3erGwmXnMzQMTh2oze5Y1MpM8hkGT0JHFetncn3Q0PGg7ye5Ew/s1600/broken_promises_by_herrfous1.jpg" width="400" /></a></div>
<div style="text-align: center;">
<a href="http://lanceeasley.com/2014/03/14/take-what-you-get-and-dont-react-adopt-an-attitude/"><span style="color: #cccccc;">image source</span></a></div>
bjkhttp://www.blogger.com/profile/17578946623234104344noreply@blogger.com7