Sunday, 25 January 2015

type_traits: GCC vs Clang

Recently I was writing another article for the Polish developer journal "Programista" (Eng. "Programmer"). This time I decided to focus on the type_traits library: what we can find inside, how it is implemented, and finally what we can expect in the future (mainly in terms of compile-time reflection).

While writing I did not refer to any particular implementation, but the Clang and GCC implementations (strictly speaking, libc++ and libstdc++) were open on the second monitor the whole time. I noticed some differences that I'd like to document here.

First things first. Whole post is based on:
The first peculiar difference is that near the top of GCC's type_traits there are some interesting template structures: __and_, __or_, __not_, and so on:
  template<typename...>
    struct __or_;

  template<>
    struct __or_<>
    : public false_type
    { };

  template<typename _B1>
    struct __or_<_B1>
    : public _B1
    { };

They are used in the following fashion:
  /// is_unsigned
  template<typename _Tp>
    struct is_unsigned
    : public __and_<is_arithmetic<_Tp>, __not_<is_signed<_Tp>>>::type
    { };
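To make the idea concrete, here is a minimal sketch of how such combinators can be written. The names or_, and_, not_ and my_is_unsigned are my own stand-ins, not the libstdc++ internals, and the recursive cases are an assumption about how the general pattern works rather than a copy of the real header.

```cpp
#include <type_traits>

// Hypothetical re-creations of the helper idea; names are illustrative only.
template <typename...>
struct or_;                                    // primary template, declared only

template <>
struct or_<> : std::false_type {};             // empty disjunction is false

template <typename B1>
struct or_<B1> : B1 {};                        // one operand: inherit from it

template <typename B1, typename B2, typename... Bn>
struct or_<B1, B2, Bn...>                      // stop at the first true operand
    : std::conditional<B1::value, B1, or_<B2, Bn...>>::type {};

template <typename...>
struct and_;

template <>
struct and_<> : std::true_type {};             // empty conjunction is true

template <typename B1>
struct and_<B1> : B1 {};

template <typename B1, typename B2, typename... Bn>
struct and_<B1, B2, Bn...>                     // stop at the first false operand
    : std::conditional<B1::value, and_<B2, Bn...>, B1>::type {};

template <typename B>
struct not_ : std::integral_constant<bool, !B::value> {};

// Same shape as the is_unsigned definition quoted above.
template <typename T>
struct my_is_unsigned
    : and_<std::is_arithmetic<T>, not_<std::is_signed<T>>>::type {};

static_assert(my_is_unsigned<unsigned int>::value, "unsigned int is unsigned");
static_assert(!my_is_unsigned<int>::value, "int is signed");
static_assert(!my_is_unsigned<double>::value, "double is signed");
static_assert(!my_is_unsigned<void*>::value, "pointers are not arithmetic");
```

Because std::conditional only names the branch that is not selected, the unused operand is never instantiated, which gives the combinators short-circuit behaviour.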

At first glance everything looks nice. However, you won't find such helpers in Clang's type_traits implementation. Clang's approach is to use standard template mechanisms, like template specialization:
// is_unsigned

template <class _Tp, bool = is_integral<_Tp>::value>
struct __libcpp_is_unsigned_impl : public integral_constant<bool, _Tp(0) < _Tp(-1)> {};

template <class _Tp>
struct __libcpp_is_unsigned_impl<_Tp, false> : public false_type {};  // floating point

template <class _Tp, bool = is_arithmetic<_Tp>::value>
struct __libcpp_is_unsigned : public __libcpp_is_unsigned_impl<_Tp> {};

template <class _Tp> struct __libcpp_is_unsigned<_Tp, false> : public false_type {};

template <class _Tp> struct _LIBCPP_TYPE_VIS_ONLY is_unsigned : public __libcpp_is_unsigned<_Tp> {};
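The core of this version is the expression _Tp(0) < _Tp(-1): converting -1 to an unsigned type wraps around to the largest value, so the comparison is true only for unsigned types. Below is a reduced sketch of the same dispatch-on-bool pattern. The name is_unsigned_sketch is hypothetical, and I collapsed the two dispatch levels into one, which is a simplification of mine.

```cpp
#include <type_traits>

// Primary template: chosen when T is arithmetic (the defaulted bool is true).
// T(-1) wraps to the maximum value for unsigned T, so 0 < T(-1) holds.
template <typename T, bool = std::is_arithmetic<T>::value>
struct is_unsigned_sketch
    : std::integral_constant<bool, (T(0) < T(-1))> {};

// Specialization: non-arithmetic types never evaluate the comparison at all.
template <typename T>
struct is_unsigned_sketch<T, false> : std::false_type {};

static_assert(is_unsigned_sketch<unsigned>::value, "0 < UINT_MAX");
static_assert(!is_unsigned_sketch<int>::value, "0 < -1 is false");
static_assert(!is_unsigned_sketch<double>::value, "0.0 < -1.0 is false");
static_assert(!is_unsigned_sketch<void*>::value, "not arithmetic");
```

The defaulted bool parameter plays the role that __and_ plays in GCC's version: it guards the comparison so that it is only instantiated for types where it is valid.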

There's no doubt that GCC's version is more human-friendly. We can read it like a book and everything is clear. Reading Clang's version is much harder: the code is bloated with template machinery and there are a lot of quirky helpers.
On the other hand, Clang developers rely only on mechanisms built into the language itself, such as partial specialization and default template arguments. Therefore, at least in theory, the Clang implementation should take less time to compile. It's not that easy to test this, but there's a fancy new tool out there: templight. It allows us to debug and profile template instantiations ;)
After some time playing with templight I got the following results for a simple program that just includes the type_traits library, in two versions: GCC's and Clang's.

                          GCC       Clang
Template instantiations*  3         5
Template memoizations*    59        45
Maximum memory usage      ~957kB    ~2332kB

When we sum instantiations and memoizations it turns out that the assumption was right: Clang's implementation of type_traits should be slightly faster to compile. But of course it depends on how the compilers themselves are implemented.
Another interesting fact is that compiling Clang's type_traits consumes a lot more memory than compiling GCC's version.
Both versions of type_traits were compiled using Clang 3.6 (SVN) with the templight plugin, so there's no data for the GCC compiler. It may be that, when compiled with GCC, GCC's version of type_traits performs better.

The second thing I noticed is that Clang's type_traits strives to provide implementations that do not rely on compiler built-ins wherever possible, while GCC's leans on them. Good examples are the implementations of std::is_class and std::is_enum.
GCC's approach is simply to use compiler built-ins, like __is_class and __is_enum.
  /// is_class
  template<typename _Tp>
    struct is_class
    : public integral_constant<bool, __is_class(_Tp)>
    { };
...
  /// is_enum
  template<typename _Tp>
    struct is_enum
    : public integral_constant<bool, __is_enum(_Tp)>
    { };

  { "__is_class",   RID_IS_CLASS,   D_CXXONLY },
  { "__is_enum",    RID_IS_ENUM,    D_CXXONLY },
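The built-ins can also be called directly, which makes it easy to see what they report. This is only a small sketch: __is_class and __is_enum are compiler extensions (available in both GCC and Clang), not standard C++, and the types Widget, Storage and Color are made up for the example.

```cpp
#include <type_traits>

// Hypothetical types used only for this illustration.
struct Widget {};
union Storage { int i; float f; };
enum class Color { red, green };

// The built-ins answer the question directly, with no template tricks.
static_assert(__is_class(Widget), "a struct is a class type");
static_assert(!__is_class(Storage), "the built-in excludes unions");
static_assert(__is_enum(Color), "scoped enums are enums");
static_assert(!__is_enum(int), "int is not an enum");

// The standard traits report the same results.
static_assert(std::is_class<Widget>::value == __is_class(Widget), "");
static_assert(std::is_enum<Color>::value == __is_enum(Color), "");
```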

Clang, on the contrary, utilizes metaprogramming tricks developed by the community. In the case of std::is_class the implementation is based on function overloading: the first overload of __is_class_imp::__test accepts a pointer to member and thus will be chosen by the compiler only if the type is a class or a union. Since unions pass this test too, a second check (!is_union) is needed as well. Simple and brilliant.
namespace __is_class_imp
{
template <class _Tp> char  __test(int _Tp::*);
template <class _Tp> __two __test(...);
}

template <class _Tp> struct _LIBCPP_TYPE_VIS_ONLY is_class
    : public integral_constant<bool, sizeof(__is_class_imp::__test<_Tp>(0)) == 1 && !is_union<_Tp>::value> {};
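The trick is easy to try in isolation. The sketch below mirrors the listing with my own names (detail::two, detail::test, my_is_class); the test types Point and Either are hypothetical. Note that it still borrows std::is_union, which, as far as I know, cannot be implemented without compiler help.

```cpp
#include <type_traits>

namespace detail {
struct two { char pad[2]; };                 // sizeof(two) differs from sizeof(char)

// Viable only when "int T::*" is a valid type, i.e. T is a class or a union.
// For any other T, substitution fails and this overload silently drops out.
template <typename T> char test(int T::*);

// Fallback: the ellipsis is the worst possible match, so it loses whenever
// the overload above is viable.
template <typename T> two test(...);
}  // namespace detail

template <typename T>
struct my_is_class
    : std::integral_constant<bool,
          sizeof(detail::test<T>(0)) == 1 && !std::is_union<T>::value> {};

// Hypothetical test types.
struct Point { int x; int y; };
union Either { int i; char c; };
enum Flag { off, on };

static_assert(my_is_class<Point>::value, "struct is a class");
static_assert(!my_is_class<Either>::value, "unions are filtered out");
static_assert(!my_is_class<Flag>::value, "enums have no members");
static_assert(!my_is_class<int>::value, "int has no members");
```

The functions are never defined or called; sizeof only inspects the return type of the overload that would be selected.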


In the case of enums there's also an interesting bit in Clang's implementation: a type is considered an enum if it is nothing else.
template <class _Tp> struct _LIBCPP_TYPE_VIS_ONLY is_enum
    : public integral_constant<bool, !is_void<_Tp>::value             &&
                                     !is_integral<_Tp>::value         &&
                                     !is_floating_point<_Tp>::value   &&
                                     !is_array<_Tp>::value            &&
                                     !is_pointer<_Tp>::value          &&
                                     !is_reference<_Tp>::value        &&
                                     !is_member_pointer<_Tp>::value   &&
                                     !is_union<_Tp>::value            &&
                                     !is_class<_Tp>::value            &&
                                     !is_function<_Tp>::value         > {};

Voila! No built-ins needed ;)

The last thing I'd like to mention here is std::is_function. GCC's approach is a bit ridiculous, because it needs a lot of specializations (there are 24 of them, so I won't include them all).
  template<typename _Res, typename... _ArgTypes>
    struct is_function<_Res(_ArgTypes......) volatile &&>
    : public true_type { };

  template<typename _Res, typename... _ArgTypes>
    struct is_function<_Res(_ArgTypes...) const volatile>
    : public true_type { };

  template<typename _Res, typename... _ArgTypes>
    struct is_function<_Res(_ArgTypes...) const volatile &>
    : public true_type { };

  template<typename _Res, typename... _ArgTypes>
    struct is_function<_Res(_ArgTypes...) const volatile &&>
    : public true_type { };

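I think that the six dots in _ArgTypes...... are a good example of how to make an implementation look like a nightmare. They look like double (or nested) unpacking, but they are actually a pack expansion followed by a C-style variadic ellipsis, i.e. _ArgTypes..., ... in a more readable spelling. When you have a hammer in your hand everything looks like a nail, doesn't it ;)?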
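The number 24 is not arbitrary: a function type can carry one of four cv-qualifier combinations, one of three ref-qualifier states (none, &, &&), and may or may not end with a C-style ellipsis, and 4 x 3 x 2 = 24. The sketch below shows that these really are distinct types. The trait is_c_variadic and the aliases are hypothetical names of mine, and I spell the six dots as "Args..., ..." for readability.

```cpp
#include <type_traits>

// Function types that differ only in qualifiers or a trailing ellipsis.
using plain   = int(char);
using cq      = int(char) const;
using cv_rref = int(char) const volatile &&;
using varargs = int(char, ...);

static_assert(!std::is_same<plain, cq>::value, "const makes a new type");
static_assert(!std::is_same<cq, cv_rref>::value, "so do volatile and &&");
static_assert(!std::is_same<plain, varargs>::value, "and so does ...");

// All of them are function types, so each form needs its own pattern.
static_assert(std::is_function<cq>::value, "");
static_assert(std::is_function<cv_rref>::value, "");
static_assert(std::is_function<varargs>::value, "");

// Hypothetical trait: matches only unqualified C-variadic function types.
// "R(Args..., ...)" is the same pattern as "R(Args......)".
template <typename T>
struct is_c_variadic : std::false_type {};

template <typename R, typename... Args>
struct is_c_variadic<R(Args..., ...)> : std::true_type {};

static_assert(is_c_variadic<varargs>::value, "Args = {char}, then ...");
static_assert(is_c_variadic<int(...)>::value, "Args = {}, then ...");
static_assert(!is_c_variadic<plain>::value, "no trailing ellipsis");
```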



* Memoization - "Memoization means we are _not_ instantiating a template because it is already instantiated (but we entered a context where we would have had to if it was not already instantiated)."

Thursday, 15 January 2015

C++14 template variables in action

As some of you know, in November I gave a presentation, "Metaprogramming in C++: from 70's to C++17", at the Code::Dive conference. One of the topics I covered was what can actually be a template, including template variables (officially "variable templates") from C++14. In this post I'd like to present them in action.

However, before I move forward I'd like to see if you can spot the mistake I made during the presentation. I said that the following things can be templated in C++. Can you point out the mistake?
  • C++98: class, structure, (member) function
  • C++11: using alias (alias template)
  • C++14: variable
Do you see it?

The question is tricky. It turns out that since C++98 it has been possible to write a template union as well. I don't know about you, but I haven't used a union in more than five years. I did see one recently as the storage for an optional class, though. It's a rare thing, but definitely worth knowing.
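For the record, a union template looks just like a class template. The sketch below is a made-up example in the spirit of the optional storage mentioned above; the name RawStorage and its members are hypothetical and not taken from any real optional implementation.

```cpp
#include <type_traits>

// Hypothetical union template: room for either a placeholder byte or a T.
template <typename T>
union RawStorage {
    char empty;   // active member while no value is stored
    T    value;   // active member once a value has been written
};

static_assert(std::is_union<RawStorage<int>>::value, "it really is a union");
static_assert(!std::is_class<RawStorage<int>>::value, "and not a class");
static_assert(sizeof(RawStorage<double>) >= sizeof(double), "fits a double");
static_assert(!std::is_same<RawStorage<int>, RawStorage<long>>::value,
              "each instantiation is its own type");
```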

Okay, enough off-topic. Let's get back to template variables from C++14. This feature was introduced by the N3651 proposal. The proposal argues that C++ should have a proper language mechanism instead of workarounds like a static const variable defined in a class template, or a value returned from a constexpr function template.

But what are the use cases? The proposal gives the example presented below.
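To see what the proposal wants to replace, here is a side-by-side sketch of the two workarounds and the C++14 form. The names pi_holder, pi_fn and pi_v are hypothetical and chosen only for this comparison.

```cpp
// Workaround 1: static data member of a class template.
// (Before C++17 an extra out-of-class definition is needed if it is odr-used.)
template <typename T>
struct pi_holder {
    static constexpr T value = T(3.1415926535897932385);
};

// Workaround 2: constexpr function template; every use needs call syntax.
template <typename T>
constexpr T pi_fn() { return T(3.1415926535897932385); }

// C++14: a variable template says the same thing directly.
template <typename T>
constexpr T pi_v = T(3.1415926535897932385);

// All three spell the same constant; only the syntax at the use site differs.
static_assert(pi_holder<double>::value == pi_fn<double>(), "");
static_assert(pi_fn<double>() == pi_v<double>, "");
static_assert(pi_holder<int>::value == 3 && pi_v<int> == 3, "");
```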

template <typename T>
constexpr T pi = T(3.1415926535897932385);

template <typename T>
T area_of_circle_with_radius(T r) {
    return pi<T> * r * r;
}
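What the example does not show is that each instantiation of pi is a separate constant with the precision of its own type. The sketch below repeats the listing with one change of mine: the function is marked constexpr so that the results can be checked at compile time.

```cpp
template <typename T>
constexpr T pi = T(3.1415926535897932385);

// Same function as above, made constexpr (my addition) for static_assert.
template <typename T>
constexpr T area_of_circle_with_radius(T r) {
    return pi<T> * r * r;
}

// T is deduced from the argument, and pi<T> follows it.
static_assert(area_of_circle_with_radius(1.0) == pi<double>, "pi * 1 * 1");
static_assert(area_of_circle_with_radius(2) == 12, "pi<int> truncates to 3");

// pi<float> and pi<double> are different objects holding different values.
static_assert(static_cast<double>(pi<float>) != pi<double>,
              "float keeps fewer digits");
```

The int case is a useful reminder that the conversion T(...) happens once per instantiation, so an integral pi silently becomes 3.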

Although this is a complete example, it was quite hard for me to come up with other use cases in two shakes. The breakthrough came with this commit to Clang's libcxx library. It used this very new feature to provide simpler versions of the type_traits templates. I hope the following listing illustrates this well.

namespace ex = std::experimental;

// ...
{
    typedef void T;
    static_assert(ex::is_void_v<T>, "");
    static_assert(std::is_same<decltype(ex::is_void_v<T>), const bool>::value, "");
    static_assert(ex::is_void_v<T> == std::is_void<T>::value, "");
}
{
    typedef int T;
    static_assert(!ex::is_void_v<T>, "");
    static_assert(ex::is_void_v<T> == std::is_void<T>::value, "");
}
{
    typedef decltype(nullptr) T;
    static_assert(ex::is_null_pointer_v<T>, "");
    static_assert(std::is_same<decltype(ex::is_null_pointer_v<T>), const bool>::value, "");
    static_assert(ex::is_null_pointer_v<T> == std::is_null_pointer<T>::value, "");
}
// ...
21 // ...

So from this commit onward we are able to write more concise and more readable code in one shot. Template variables turn out to be practical not only for constants like pi, but also for simplifying everyday template code.
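If your standard library does not ship these helpers yet, they are one-liners to write. The sketch below defines two of them in a namespace of my own (sketch), so it assumes nothing about how libcxx actually implements them. It needs C++14.

```cpp
#include <type_traits>

namespace sketch {
// Variable templates forwarding to the classic ::value members.
template <typename T>
constexpr bool is_void_v = std::is_void<T>::value;

template <typename T>
constexpr bool is_null_pointer_v = std::is_null_pointer<T>::value;
}  // namespace sketch

// Use sites lose the trailing ::value noise.
static_assert(sketch::is_void_v<void>, "");
static_assert(!sketch::is_void_v<int>, "");
static_assert(sketch::is_null_pointer_v<decltype(nullptr)>, "");

// constexpr implies const, hence the type checked in the listing above.
static_assert(std::is_same<decltype(sketch::is_void_v<void>),
                           const bool>::value, "");
```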