Hello internet! I'd like to inaugurate my blog with a topic that's very dear to me: metaprogramming! In particular, I want to guide you through an introduction to this fascinating discipline in a way that's easy to follow and that gives you the building blocks to write your own tools. Being completely honest, this is an advanced programming topic, but the tools we're going to use are mostly learnt during one's first year of programming.

We're going to be writing C++20, so expect lots of shiny templates! Grab a seat and a compiler and let us answer the fundamental question of this post:

What's the narrowest integer that can hold a particular value?

If you're writing size-bounded data container such as a vector or a stack with a maximum capacity and want to optimise for size, then this exercise can become a useful tool to incorporate in your project. Let's start by understanding the scope of the problem.

Definitions

First of all, we're going to define what we mean by narrowest integer. Integer types are those we use to represent integer numbers, such as char, int or long. With narrow we mean that we're going to be inspecting the width of those types, i.e. their size in bytes accessible through sizeof(<type>). For example, sizeof(int) should be 4 in your machine. I say should because your particular computer architecture may use a different number of bytes to represent an int. That's ok and you shouldn't worry about it, as our implementation is going to be architecture independent.

Also, the type of the solution will be a type. We want a piece of code that will return not the name, but the actual type we're looking for so we can use it to declare other variables. That's the fun of metaprogramming! We're going to put the compiler to work and make our program flexible to the programmer's input!

Metaprogramming techniques with templates

The basics

The tool we use in C++ to write generic code is the template. You can think of templates as blueprints for pieces of code that share a common structure for data and behaviour. Using these blueprints, we can instantiate actual code structures that will be part of our final program. For a simple but complete example, let's take a look at the definition of std::array:

template <typename T, size_t N>
struct array {
    T data[N];
    // Other goodies...
};

The template declaration indicates that we're going to be defining a class template that takes two template parameters:

  • typename T: A symbol that corresponds to the name of a type.
  • size_t N: A value known at compile time of type size_t.

When we declare an object of type std::array<int, 10>, the compiler will substitute all instances of T and N with int and 10 respectively. In this example, the inner body of the class would be substituted by int data[10], an array that holds ten ints just like we wanted. This technique is called template specialisation, meaning that we're specialising the generic type array<T,N> with T=int and N=10. In the case of std::array, the specialisation is full, but we can partially specialise templates by only specialising some of its parameters.

What we want to do now is to use this technique to work with the template parameters as data and manipulate them to make computations at compile time. To achieve this, we're going to add two new ingredients into the mix.

Type aliases and constant expressions

We're now entering the realm of functional programming. Here, there are no variables, just constants values that we can combine to define more constant values. Keep in mind that the values we're going to be working with here are both numeric values and types. That's right, we're going to be manipulating types just like we manipulate numbers. Let's start by declaring a structure template that can hold an alias for any type:

template <typename T>
struct type_holder {
    using type = T; // Equivalent to `typedef T type`
};

This structure contains a type alias called type that we can access through the :: operator. For example, if we declare the specialisation type_holder<char *>, then type_holder<char *>::type will be char *. But how can we assert that with code? After all, we'll want to test our results. To achieve this, we want to take a look at an implementation of integral constants, C++'s tool for representing constant values with templates:

template <typename T, T Value>
struct integral_constant {
    using value_type = T;
    static constexpr T value = Value;
//  ^~~~~~ ^~~~~~~~~
//  |      Must be known at compile time
//  |
//  Accessible from the type with operator ::

    // Other goodies...
};

Our tests need to assert truthfulness and falseness and the standard library defines two specialisations of integral_constant for that purpose:

using true_type  = integral_constant<bool, true>;
using false_type = integral_constant<bool, false>;

// Assert these expressions at compile time or throw a compilation error
static_assert(true_type::value);
static_assert(!false_type::value);

Wait, isn't this a bit overkill? After all, the values true and false are already part of the core language, right? Yes, you're right. However, using std::integral_constant will make our life easier by simplifying the code we have to write from now own. Let's see it in action by writing a structure that will assert the type held in type_holder.

Partial specialisation and inheritance

What we want now is a structure that can tell us if two types are equal. The standard library gives us a structure that does just that called std::is_same. Let's take a look at its implementation

template <typename T, typename U>
struct is_same : std::false_type { };

template <typename T>
struct is_same<T, T> : std::true_type { };
//                     ^~~~~~
//                     Partial specialisations require that we specify the
//                     parameters we're using after the structure's name.
//                     In this case, both parameters are of the same type.

Simple, right? Template instantiation rules specify that the most specialised template will be used if many specialisations are available. In this case, is_same<int, int> will use the specialisation inheriting from true_type because <T, T> is more specialised than <T, U>. On the other hand, is_same<int, bool> will use the one that inherits from false_type because <T, T> cannot be instantiated with two different types.

Now, here's the kicker: since both specialisations inherit from types that are full specialisations of std::integral_constant, they will also inherit their value_type and value members. This means that just by using inheritance we've made one expression's value equal to true and the other, to false.

With this tool, we can write a nice test:

static_assert(std::is_same<type_holder<char *>::type, char *>::value);

In fact, we can simplify this code a bit by using the suffixes _t and _v to directly access the type and value of the structures. First, let's look at how we can define these symbols for both type_holder and is_same:

template <typename T>
using type_holder_t = typename type_holder<T>::type;
//                    ^~~~~~~~
//                    Required as we don't know the type of `T` yet

template <typename T, typename U>
inline constexpr T is_same_v = std::is_same<T, U>::value;

But wait, why are we using _t and _v anyway if we could be concealing the implementation in a deeper namespace and just exposing the types and values in the std namespace? Well, directly accessing the type like this was an afterthought that started to be added to the standard in C++14 and since all C++ code must be backwards compatible, new symbols had to be added for this task. In the case of _v, the qualifier inline constexpr is required to globally expose the value and it wasn't added to the language until C++17, so all your code written in C++11 or 14 will still have to use ::value.

With these two shortcuts, we can simplify the previous test quite a bit:

static_assert(std::is_same_v<type_holder_t<char *>, char *>);

Nice! We now have a tool to check that our templates are choosing the correct type! Let's now start our work to find the proverbial narrowest integer. In order to find it, we will have to introduce the last tool we'll learn in this article.

Packs, recursivity, and finding the narrowest integer

Declaring the structures

In order to find the narrowest integer, we will have to provide a list of integer types to search through. We're going to define our list with four fixed-width integer types: int8_t, int16_t, int32_t, and int64_t. These type are required to be at least their announced number of bytes wide, so there's no need to call sizeof on them to find their size and we can order them from narrowest to widest just by looking at their names. Of course, you're free to expand the list with other types you like.

Let's declare the structures we're going to be working with:

namespace impl {
    template <size_t Value, typename... Integers>
    struct narrowest_integer_for;
}

template <size_t Value>
struct narrowest_integer_for :
        impl::narrowest_integer_for<Value, int8_t, int16_t, int32_t, int64_t>
{ };

There's a bit to unpack here (pun intended). First, we have the template parameter typename... Integers, which declares a template parameter pack. Parameter packs allow us to write variadic templates that receive an arbitrary number of arguments. In the previous code, impl::narrowest_integer_for will be instantiated with one numeric value and any number of types.

We also have a second narrowest_integer_for in the global namespace that takes the Value parameter. This structure inherits from impl::narrowest_integer_for, passing both the Value parameter and a pack of types. We're doing this to define our own list of integer types to search through and allow the user to just specify the value they want to use for the search.

Parameter pack destructuring

What we want to do now is iterate through our parameter pack and inspect every type to see if it can hold our value. We're going to tackle the first part of the problem first: iterating through the pack. To do this, we're going to destructure the pack and consume it recursively.

Wait, what does this all mean? Let's take a look at the code first and explore this technique with a concrete example:

namespace impl {
    // A preliminary declaration is required to declare specialisations
    template <size_t Value, typename... Integers>
    struct narrowest_integer_for;

    // Base case: The pack only contains one type
    template <size_t Value>
    struct narrowest_integer_for<Value> {
        using type = void;
    };

    // General case, we destructure the pack as (X:XS)
    template <size_t Value, typename X, typename... XS>
    struct  narrowest_integer_for<Value, X, XS...> :
            narrowest_integer_for<Value, XS...>
    { };
}

If you've worked we Haskell before you may be familiar with the (x:xs) idiom. If you haven't don't worry, we're going to explain how it works.

Let's focus first on the base case structure. The parameter typename... Integers is gone! This means that the base case will be reached when there are no elements left in the pack to consume. If we haven't found an integer type wide enough to represent Value, we want to return an error. Since we cannot use void as the type of an object, we assign it in the base case as an error-like type.

In the general case, we're destructuring Integers into X, which will call the head, and XS..., which will call the tail. This means that when we instantiate this template, the argument pack will be divided into two parts: a head containing the first element and a tail containing the rest. For example, this is how the first pack will be destructured:

template <size_t Value>
struct narrowest_integer_for :
        impl::narrowest_integer_for<Value, int8_t, int16_t, int32_t, int64_t>
//                                         ^~~~~~  ^~~~~~~~~~~~~~~~~~~~~~~~~
//                                         X       XS...
//                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
//                                         Integers
{ };

As you can see, impl::narrowest_integer_for<Value, X, XS...> recursively inherits from impl::narrowest_integer_for<Value, XS...>, consuming the head of the pack. The next time we instantiate the pack, it will only contain the types int16_t, int32_t, and int64_t and so on. Without more operations, this structure will always yield the type void, as it unconditionally consumes the pack until it reaches the base case. Let's introduce our final structure, one that will let us check if the type X can be used to represent Value.

Conditional type assignment

Our algorithm to find the narrowest type is very simple:

  • For every type X in the pack:
    • If Value is less or equal than the highest number X can hold:
      • Return X
  • If no suitable type was found in the pack:
    • Return void

We've already taken care of the last condition by iterating through the parameter pack. For the first condition, we need a structure that will let us choose between two types depending on the result of a condition. The standard library defines std::conditional to let us do just that. We're going to be using this structure, but let's take a look at its implementation first:

template <bool Expr, typename TrueT, typename FalseT>
struct conditional;

template <typename TrueT, typename FalseT>
struct conditional<true, TrueT, FalseT> {
    using type = TrueT;
};

template <typename TrueT, typename FalseT>
struct conditional<false, TrueT, FalseT> {
    using type = FalseT;
};

template <bool Expr, typename TrueT, typename FalseT>
using conditional_t = typename conditional<Expr, TrueT, FalseT>::type;

If you've been following this article, then there should be no mysteries here. This structure lets us select the first (TrueT) or second (FalseT) type depending on the value of the boolean expression.

What we need for the condition is the highest number a particular type can hold. To get it, we're going to use the numeric limits library, which will let us access the maximum value of an integer type. For example:

static_assert(std::numeric_limits<int8_t>::max() == 127);
//            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
//            This function is consteval-qualified, which means that its
//            value must and will always be known at compile time!

With this last piece of the puzzle, let's re-write the general case structure of narrowest_integer_for to include this check:

template <size_t Value, typename X, typename... XS>
struct narrowest_integer_for<Value, X, XS...> : std::conditional<
        (Value <= std::numeric_limits<X>::max()),
        X,
        typename narrowest_integer_for<Value, XS...>::type>
{ };

Every time we instantiate this structure, we check if Value is less or equal than the highest number X can hold. If it is, we assign X to type and the operation is over. Otherwise, we recurse into narrowest_integer_for<Value, XS...> and assign its type member to the type member of this structure instead.

As this is a recursive operation, the value of type in any level N will also be assigned to all levels [0..N-1]. Thus, no matter how deep we are into the search, the final value of type will be accessible from the top-level structure. This is exactly what we want, as we'll always access the type member from the definition of narrowest_integer_for that lives in the global namespace.

In conclusion

The problem is solved: we have created a structure capable of determining the narrowest integer that can hold a particular value at compile time.

In this article we have taken a look into how template metaprogramming works and how to perform some operations with templates. Using std::integral_constant and many other structures found in the type traits library you can make your programs behave more flexibly and take tons on decisions on compile time.

Here's the final code for the exercise we've been working on:

#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

namespace impl {
    template <size_t Value, typename... Integers>
    struct narrowest_integer_for;

    template <size_t Value>
    struct narrowest_integer_for<Value> {
        using type = void;
    };

    template <size_t Value, typename X, typename... XS>
    struct narrowest_integer_for<Value, X, XS...> : std::conditional<
            (Value <= std::numeric_limits<X>::max()),
            X,
            typename narrowest_integer_for<Value, XS...>::type>
    { };
}

template <size_t Value>
struct narrowest_integer_for :
        impl::narrowest_integer_for<Value, int8_t, int16_t, int32_t, int64_t>
{ };

template <size_t Value>
using narrowest_integer_for_t = typename narrowest_integer_for<Value>::type;

static_assert(std::is_same_v<narrowest_integer_for_t<75>,           int8_t>);
static_assert(std::is_same_v<narrowest_integer_for_t<250>,          int16_t>);
static_assert(std::is_same_v<narrowest_integer_for_t<195518>,       int32_t>);
static_assert(std::is_same_v<narrowest_integer_for_t<237254386759>, int64_t>);

I hope that you enjoyed reading this first article of mine and that you learnt some techniques that you can use in your programs. If you'd like to discuss this article, feel free to send me an email at the address listed at the top.

Thank you for reading and have a great day!