TypeId? Wait.. what exactly is it?
With moderate Google-fu (or some prior knowledge) it takes less than a minute to find the typeid operator, which has been available since C++98. It yields a const std::type_info describing the type given as its argument. It is worth noticing that the result is an lvalue and that the object it refers to lives until the end of your application. What does that mean? Yes, you're right: the type information is an object you query at run time, and it cannot be used where a compile-time constant is required. Since this is not what I wanted, this solution was unacceptable.
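To make the run-time nature of typeid concrete, here is a minimal sketch. The helper names same_type_at_runtime and check_typeid_demo are made up for this illustration; only typeid and std::type_info come from the standard library. In C++14 the comparison is an ordinary function call made at run time.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical helper: true when both template arguments name the same type,
// decided by comparing the std::type_info objects that typeid refers to.
template <typename A, typename B>
bool same_type_at_runtime() {
    const std::type_info& a = typeid(A);  // lvalue, lives until program exit
    const std::type_info& b = typeid(B);
    return a == b;                        // run-time comparison in C++14
}

// Small self-check for the helper above.
inline void check_typeid_demo() {
    assert((same_type_at_runtime<int, int>()));
    assert((!same_type_at_runtime<int, long>()));
}
```

Note that the result cannot be fed to static_assert or used as a template argument in C++14, which is exactly the limitation described above.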
Even if it were acceptable, this type information is not available unless your program is compiled with RTTI (Run-Time Type Information; GCC and Clang enable it by default, which corresponds to -frtti, and -fno-rtti turns it off). RTTI has a cost, mostly extra type data in the binary. We would have to pay for it, but we'd rather not. What do we do, then?
Black magic applied
To solve this problem at compile time we have to somehow map distinct types to unique identifiers. I believe there are a lot of approaches out there, but my favorite one is to employ function addresses. Why function addresses? Because the C++ standard guarantees that pointers to two distinct functions compare unequal. In other words, two functions can't share one address in memory. (One caveat: some linkers offer an identical-code-folding option, such as gold's --icf=all or MSVC's /OPT:ICF, which can merge functions with identical machine code, so this trick assumes such folding is not in use.)
However, how can we make use of this fact to create an identifier↔type mapping? It's easy - we'll treat the addresses of static member functions of a class template as our identifiers. The following snippet illustrates this idea.
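As a tiny illustration of that guarantee, consider the sketch below. The functions answer, zero and addresses_differ are hypothetical names invented for this example.

```cpp
// Two hypothetical, unrelated functions.
int answer() { return 42; }
int zero() { return 0; }

// Pointers to distinct functions compare unequal.
bool addresses_differ() {
    return &answer != &zero;
}
```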
    #include <cstdint>

    using TypeId = std::uintptr_t;

    template <typename T>
    struct TypeIdGenerator {
        // Each specialization has its own GetTypeId, hence its own address.
        static TypeId GetTypeId() {
            return reinterpret_cast<TypeId>(&GetTypeId);
        }
    };
So now you may be asking yourself about this weird uintptr_t type. Why this particular type? Why not simply int? Because a pointer (here a pointer to a function, since a static member function is an ordinary function) can be bigger than int or long. The C++ standard fixes only minimum ranges for the built-in types, not their exact sizes, but that is a topic for another blog post ;-). uintptr_t is an unsigned integer type able to hold any void* value. Strictly speaking the standard makes it optional and promises nothing about function pointers, but on mainstream platforms function pointers and data pointers have the same size, and that saves the day.
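If you would rather not rely on that size assumption silently, it can be checked at compile time. The following is only a sketch: the alias FunctionPointer and the constant kFunctionPointerFits are names introduced here for illustration.

```cpp
#include <cstdint>

using TypeId = std::uintptr_t;

// Type of a pointer to a static member function such as GetTypeId
// (hypothetical alias, used only for the size check below).
using FunctionPointer = TypeId (*)();

// Refuse to compile on a platform where the cast would lose bits.
static_assert(sizeof(FunctionPointer) <= sizeof(TypeId),
              "function pointers do not fit into std::uintptr_t here");

// The same condition as a named constant, in case other code wants to test it.
constexpr bool kFunctionPointerFits =
    sizeof(FunctionPointer) <= sizeof(TypeId);
```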
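There's one more caveat here, though. Unlike POSIX, the C++ standard does not guarantee that pointers to functions can be converted to object pointers such as void*; that conversion is only conditionally supported. The reason is that code may (theoretically) live in a different kind of memory (with a different word size, etc.) than data. Converting a function pointer to a sufficiently large integer type is allowed, but the mapping is implementation-defined and no implicit conversion or static_cast exists for it. That is why I was forced to use reinterpret_cast in the example above.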
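Putting the pieces of this section together, a usage sketch could look like this. TypeIdGenerator is the class template from above, repeated so that the example compiles on its own; HaveSameTypeId is a hypothetical helper added only for the demonstration.

```cpp
#include <cstdint>

using TypeId = std::uintptr_t;

template <typename T>
struct TypeIdGenerator {
    // The address of this specialization's function is the identifier.
    static TypeId GetTypeId() {
        return reinterpret_cast<TypeId>(&GetTypeId);
    }
};

// Hypothetical helper: compares the identifiers generated for two types.
template <typename A, typename B>
bool HaveSameTypeId() {
    return TypeIdGenerator<A>::GetTypeId() == TypeIdGenerator<B>::GetTypeId();
}
```

The same type always yields the same identifier, and different types yield different ones, but the values themselves are only known once the program is linked and loaded.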
Back to functions
Okay, but there is still one problem with the presented code snippet. It leaves a trace in the assembly (and thus also in the binary file):
    # g++-4.9.1, -std=c++14 -Os
    .LHOTB0:
    # TypeIdGenerator<int>::GetTypeId()
    .weak _ZN15TypeIdGeneratorIiE9GetTypeIdEv
    .type _ZN15TypeIdGeneratorIiE9GetTypeIdEv, @function
    _ZN15TypeIdGeneratorIiE9GetTypeIdEv:
    .LFB2:
    .cfi_startproc
    movl $_ZN15TypeIdGeneratorIiE9GetTypeIdEv, %eax
    ret
    .cfi_endproc
To overcome this I tried constexpr functions from C++11. I thought it was a good direction because these functions can be evaluated both at compile time and at run time. So maybe the compile-time version would have some kind of compile-time address that we could use? And maybe this address would not be present in the run-time binary? Let's see.
    #include <cstdint>

    using TypeId = std::uintptr_t;

    // A constexpr function returning void needs C++14.
    template <typename T>
    constexpr inline void GetTypeId() {}
Now we can take the address of this free function, but unfortunately it still has to occupy some room in memory. So I was very wrong. However, we have at least reduced the implementation of this function to a single ret instruction ;-).
    _Z9GetTypeIdIiEvv:
    .LFB2:
    .cfi_startproc
    ret
    .cfi_endproc
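For what it's worth, the address of such a function template can at least be stored in a constexpr variable if the identifier stays a function pointer instead of being cast to an integer (reinterpret_cast is not allowed in constant expressions). The sketch below assumes C++14; TypeTag, GetTypeIdOf, kIntId and SameId are hypothetical names, and TypeId is redefined here as a function pointer type, which differs from the alias used earlier. The function body is still emitted into the binary, and a linker that folds identical code could merge the specializations.

```cpp
// Identifier kept as a function pointer, so no reinterpret_cast is needed.
using TypeId = void (*)();

// One empty function per type; only its address matters.
template <typename T>
constexpr void TypeTag() {}

// Hypothetical accessor returning the identifier for T.
template <typename T>
constexpr TypeId GetTypeIdOf() {
    return &TypeTag<T>;
}

// The address of a function is usable as a constant expression.
constexpr TypeId kIntId = GetTypeIdOf<int>();

// Run-time comparison of two identifiers.
inline bool SameId(TypeId a, TypeId b) {
    return a == b;
}
```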
Mission failed. I'm sorry. Maybe you have some other ideas how to achieve what I want? Nevertheless I decided to check one last thing - constexpr objects.
Constexpr objects
This was my last resort, but I knew it's not gonna work...
    #include <cstdint>

    using TypeId = std::uintptr_t;

    template <typename Type>
    struct GetTypeId {
        constexpr GetTypeId() {}
    };

    int main() {
        constexpr GetTypeId<int> object;
        // Address of an automatic object, narrowed to int: only useful for reading the assembly.
        return reinterpret_cast<TypeId>(&object);
    }
I know it is barely valid, because it returns the address of an automatic variable (truncated to int, at that). Also, because this variable is automatic and not static, the identifier will vary depending on the function in which this constexpr object was created (on the stack). Too bad. However, it gave a very interesting output, presented in the following assembly snippet.
    main:
    .LFB1:
    .cfi_startproc
    leaq -1(%rsp), %rax
    ret
    .cfi_endproc
As you can see, there is nothing related to the object variable. At least we don't see it at first glance. GCC simply uses the fact that the object sits on the stack, at offset -1 from the stack pointer, so a single leaq instruction computes its address. I was curious whether Clang would behave similarly in this situation:
    main: # @main
    .cfi_startproc
    # BB#0:
    movl $_ZZ4mainE6object, %eax
    retq
In Clang the object has a different address, one that is not relative to the stack. Interesting. The symbol _ZZ4mainE6object demangles to main::object, so in this case Clang allocated space in the binary to accommodate the constexpr object, even though it is not static!
I don't know whether this is a bug in Clang or a bug in GCC. If time permits I'll investigate this further.
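Coming back to the remark that an automatic object gives a different identifier in every function: a static object per type avoids that. The following is a sketch of that variant, not something tested in the experiments above. TypeTag, tag and GetTypeIdOf are hypothetical names, and TypeId is redefined here as a pointer to the tag object. Each tag still occupies one byte in the binary.

```cpp
// Identifier is the address of a per-type static object.
using TypeId = const char*;

template <typename T>
struct TypeTag {
    static const char tag;  // one object per T, static storage duration
};

// Out-of-class definition; every specialization gets its own object.
template <typename T>
const char TypeTag<T>::tag = 0;

// Hypothetical accessor: the same T gives the same address in any function.
template <typename T>
constexpr TypeId GetTypeIdOf() {
    return &TypeTag<T>::tag;
}
```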