Programming soup: November 2014

Today we had another interesting conversation at the office. It all started with me explaining to a colleague what the static keyword does with respect to free functions. At the time I was wondering whether the same rules apply to variables defined at translation-unit scope. And... no surprises (luckily, not today). However, the conversation evolved, and I ended up trying to implement my own compile-time TypeId facility with no memory footprint (one that vanishes from the binary). Please don't ask me how we jumped from the static keyword to TypeId ;) It just happened, and I'd like to share some thoughts about it with you.

TypeId? Wait... what exactly is it?
With moderate Google-fu (or prior knowledge) it takes less than a minute to find the typeid operator, which has been available since C++98. It returns a std::type_info that carries some information about the type passed as its argument. It is worth noting that the returned value is an lvalue and that the object it refers to lives until the end of your program. What does that mean? Yes, you're right: the type information is a real object that exists at run time. Since this is not what I wanted, this solution was unacceptable.
Even if it were acceptable, this type information is not available unless you compile your program with RTTI (Run-Time Type Information; on by default in Clang/GCC, controlled by the -frtti / -fno-rtti switches). It means there's an overhead. We would have to pay for it, but we'd rather not. What do we do, then?
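For reference, here is a minimal sketch of what the typeid approach looks like in practice. The helper names (IntInfo, SameType, CheckTypeid) are made up for this illustration, and the code assumes RTTI is enabled.

```cpp
#include <cassert>
#include <typeinfo>

// typeid yields an lvalue referring to a const std::type_info object with
// static storage duration, so keeping a pointer to it is safe.
const std::type_info* IntInfo() { return &typeid(int); }

// Hypothetical helper: compares two type_info objects with operator==.
bool SameType(const std::type_info& a, const std::type_info& b) {
    return a == b;
}

// Hypothetical self-check exercising the helpers above.
void CheckTypeid() {
    assert(SameType(typeid(int), typeid(int)));
    assert(!SameType(typeid(int), typeid(long)));
    assert(*IntInfo() == typeid(int));
}
```

Calling CheckTypeid() from main should pass silently; the point is only that every queried type leaves a type_info object behind in the binary.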

Black magic applied
To solve this problem at compile time we have to somehow map distinct types to unique identifiers. I believe there are a lot of approaches out there, but my favorite one is to employ function addresses. Why function addresses? Because any two distinct functions are guaranteed to have different addresses: the standard requires pointers to different functions to compare unequal. In other words, two functions can't share one address in memory, and the compiler has to respect that.
However, how can we use this fact to create an identifier↔type mapping? It's easy: we'll treat the addresses of static member functions of a class template as our identifiers. The following snippet illustrates the idea.

#include <cstdint>

using TypeId = std::uintptr_t;

template <typename T>
struct TypeIdGenerator {
    // Each instantiation has its own GetTypeId, hence its own address.
    static TypeId GetTypeId() {
        return reinterpret_cast<TypeId>(&GetTypeId);
    }
};

So now you may be wondering about this weird uintptr_t type. Why this particular type? Why not simply int? Because a pointer (in this scenario a pointer to a function) can be bigger than types like int or long. Actually, the C++ standard does not fix the exact sizes of built-in types, but this is a topic for another blog post ;-). uintptr_t is guaranteed to be able to hold any object pointer (a void*), and on common platforms such as x86-64 function pointers fit into it as well - and that saves the day.
There's one more caveat here, though. Unlike POSIX, the C++ standard does not guarantee that pointers to functions can be converted to object pointers (such a conversion is only conditionally supported), and converting them to integers gives an implementation-defined result. It is because code may (theoretically) be located in a different kind of memory (with a different word size etc.) than data. No implicit conversion exists either, which is the main reason why I was forced to use reinterpret_cast in the example above.
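To see the mapping in action, here is a minimal usage sketch. It repeats the generator from above; the static_assert and the CheckGeneratedIds function are additions for illustration, and the size check encodes the assumption that function pointers fit into uintptr_t on the target platform.

```cpp
#include <cassert>
#include <cstdint>

using TypeId = std::uintptr_t;

template <typename T>
struct TypeIdGenerator {
    static TypeId GetTypeId() {
        return reinterpret_cast<TypeId>(&GetTypeId);
    }
};

// Assumption: a function pointer is not wider than uintptr_t here.
static_assert(sizeof(TypeId) >= sizeof(TypeId (*)()),
              "uintptr_t is too small to hold a function pointer");

// Hypothetical self-check: same type gives the same id, different types differ.
void CheckGeneratedIds() {
    const TypeId int_id = TypeIdGenerator<int>::GetTypeId();
    assert(int_id == TypeIdGenerator<int>::GetTypeId());
    assert(int_id != TypeIdGenerator<double>::GetTypeId());
}
```

The identifiers are only meaningful within a single run of a single program; their numeric values are whatever addresses the linker and loader happened to assign.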

Back to functions
Okay, but there is still one problem with the presented snippet. It leaves a trace in the assembly (and thus also in the binary):

# g++-4.9.1, -std=c++14 -Os
.LHOTB0:
    #        TypeIdGenerator<int>::GetTypeId()
    .weak    _ZN15TypeIdGeneratorIiE9GetTypeIdEv
    .type    _ZN15TypeIdGeneratorIiE9GetTypeIdEv, @function
_ZN15TypeIdGeneratorIiE9GetTypeIdEv:
.LFB2:
    .cfi_startproc
    movl    $_ZN15TypeIdGeneratorIiE9GetTypeIdEv, %eax
    ret
    .cfi_endproc

To overcome this I tried constexpr functions, introduced in C++11. It seemed like a good direction because these functions can be evaluated both at compile time and at run time. So maybe the compile-time version has some kind of compile-time address that we can use? And maybe this address will not be present in the run-time binary? Let's see.

#include <cstdint>

using TypeId = std::uintptr_t;

// A constexpr function returning void requires C++14.
template <typename T>
constexpr inline void GetTypeId() {}

Now we can take the address of this free function, but unfortunately it still needs some room in memory. So I was very wrong. However, we have at least reduced the body of this function to a single ret instruction. At least that ;-).

_Z9GetTypeIdIiEvv:
.LFB2:
    .cfi_startproc
    ret
    .cfi_endproc
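Even with that one ret instruction left in the binary, the function template still works as an identifier source. A minimal sketch, assuming C++14 or later: IdOf and CheckFunctionIds are hypothetical helpers added for illustration, and the conversion to an integer happens at run time.

```cpp
#include <cassert>
#include <cstdint>

using TypeId = std::uintptr_t;

template <typename T>
constexpr inline void GetTypeId() {}

// Hypothetical helper: turns the address of GetTypeId<T> into an integer.
template <typename T>
TypeId IdOf() {
    void (*fn)() = &GetTypeId<T>;  // typed pointer to the specialization
    return reinterpret_cast<TypeId>(fn);
}

// Hypothetical self-check: each specialization has its own address.
void CheckFunctionIds() {
    assert(IdOf<int>() == IdOf<int>());
    assert(IdOf<int>() != IdOf<double>());
}
```

Every type that is ever passed to IdOf instantiates its own GetTypeId, so the footprint grows by one tiny function per type.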

Mission failed. I'm sorry. Maybe you have some other ideas on how to achieve what I want? Nevertheless, I decided to check one last thing: constexpr objects.

Constexpr objects
This was my last resort, but I knew it wasn't going to work...

#include <cstdint>

using TypeId = std::uintptr_t;

template <typename Type>
struct GetTypeId {
    constexpr GetTypeId() {}
};

int main() {
    constexpr GetTypeId<int> object;
    // The address of an automatic variable, narrowed to int on return.
    return reinterpret_cast<TypeId>(&object);
}

I know it is barely valid, because it returns the address of an automatic variable. Also, because this variable is automatic and not static, the identifiers will vary depending on the function in which the constexpr object was created (on the stack). Too bad. However, it gave a very interesting result, shown in the following assembly snippet.

main:
.LFB1:
    .cfi_startproc
    leaq    -1(%rsp), %rax
    ret
    .cfi_endproc

As you can see, there is nothing related to the object variable, at least not at first glance. GCC simply computes the object's address relative to the stack pointer, hence the leaq instruction. I was curious whether Clang would behave similarly in this situation:

main:                                   # @main
    .cfi_startproc
# BB#0:
    movl    $_ZZ4mainE6object, %eax
    retq

With Clang the object has an address that is not relative to the stack. Interesting. It means that Clang allocated space in the binary to accommodate this constexpr object, even though it's not static!
I don't know whether this is a bug in Clang or a bug in GCC. If time permits, I'll investigate further.
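Coming back to the remark that an automatic object yields identifiers that depend on the enclosing function: a minimal sketch of a variant with static storage, which keeps the identifier stable across functions. This is not the zero-footprint solution I was after (each type costs one byte of data), and the names TypeTag, IdOf, FromOneFunction and FromAnotherFunction are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

using TypeId = std::uintptr_t;

template <typename T>
struct TypeTag {
    static const char tag;  // one byte with static storage per type
};

template <typename T>
const char TypeTag<T>::tag = 0;

// Hypothetical helper: the address of the static tag serves as the id.
template <typename T>
TypeId IdOf() {
    return reinterpret_cast<TypeId>(&TypeTag<T>::tag);
}

// Two unrelated functions asking for the id of the same type.
TypeId FromOneFunction() { return IdOf<int>(); }
TypeId FromAnotherFunction() { return IdOf<int>(); }
```

Both functions return the same value for int, while IdOf<double>() returns a different one, because distinct objects with static storage have distinct addresses.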

The Code::dive conference is over. Emotions have subsided. In this post I'd like to summarize the conference from my personal point of view.

There were a lot of extremely interesting talks, about C++ and more. Undoubtedly we had one C++ star on board, Scott Meyers, and I'm going to start with him.

Scott gave two talks: CPU Caches and why you care and Support for embedded programming in C++11 and C++14. Both were valuable from the perspective of a modern C++ engineer. I encourage you to watch the videos!

Besides Scott, we had our rising C++ stars from Poland: Andrzej Krzemieński and Bartosz Szurgot. Andrzej talked about bugs in C++ applications. Bartek's talk was about common pitfalls of threading in C++. These two talks were also very inspiring, so go watch them as well ;)

I also gave a talk at this conference. My presentation was about Metaprogramming in C++ - from 70's to C++17. As another speaker said, the first time always sucks. And indeed it was my first performance in English. However, people say that it went quite well, so if you have time to waste you can go watch my presentation ;). For the record, the slides are here: slideshare.

The last thing I'd like to mention is the organisation of the conference. It was the first conference of this size in Wrocław and I can honestly say that it went perfectly. Big thanks to the organizers!


Friday, 14 November 2014

TypeId using constexpr objects

Wednesday, 12 November 2014

Code::Dive conference