Meet C++/WinRT 2.0: Optimizing Components

You may have noticed that all the 2.0 entries thus far have been focused on the component author. That’s no coincidence: C++/WinRT 2.0 is very much focused on improving the correctness, efficiency, reliability, and productivity of the developer building a WinRT component. One of the improvements for component developers could not be made without introducing a breaking change, so what I’m describing today is opt-in, although it is enabled by default for new projects. An existing project can opt in using the C++/WinRT compiler’s new -optimize command-line option, or in Visual Studio by setting the “Optimized” option to true.

First, I’ll describe why this is a cool optimization you should care about, and then I’ll talk about how it’s implemented so you’ll understand why this is a breaking change worth applying to existing projects.

-optimize enables what is often called uniform construction. This feature was long requested but eluded me for some time, and I am very pleased that we can finally rely on it. Uniform (or unified) construction is the notion that you can use the C++/WinRT language projection itself to create and use your intra-component types, that is, types implemented by your own component, without getting into weird loader issues, and you can do so efficiently. This solves a few pitfalls that hampered developers building complex components in the past. Imagine you have the following WinRT class (defined in IDL):

namespace Component
{
    runtimeclass Class
    {
        Class();
        void Method();
        static void StaticMethod();
    }
}

Naturally, as a C++ developer familiar with using the C++/WinRT library you might want to use the class as follows:

using namespace winrt::Component;

Class c;
c.Method();
Class::StaticMethod();

And this would be perfectly reasonable, if this code didn’t reside within the same component that implements this class. You see, the thing about C++/WinRT is that as a language projection it shields the developer from the ABI. C++/WinRT never calls directly into the implementation. It always travels through the ABI. Now this is not the ABI that C++ compiler developers talk about. This is the COM-based ABI that WinRT defines. So that first line where you are constructing the Class object actually calls the RoGetActivationFactory function to retrieve the class or activation factory and then uses that factory to create the object. The last line likewise uses the factory to make what appears to be a static method call. Thankfully, C++/WinRT has a blazingly fast factory cache, so this isn’t a problem for apps. The trouble is that within a component you’ve just done something that is a little problematic.
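To get a feel for why even a fast cache is still an indirection, here is a minimal sketch in plain C++ (no WinRT; all names here are hypothetical stand-ins, not the real C++/WinRT internals). The first activation pays for an expensive name-based lookup, like RoGetActivationFactory; subsequent activations hit the cache, but every activation still goes through the factory rather than constructing the type directly:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a WinRT activation factory.
struct Factory
{
    std::function<std::shared_ptr<void>()> activate;
};

// Counts how many expensive lookups actually happen.
static int g_expensive_lookups = 0;

// Simulates RoGetActivationFactory: an expensive, name-based lookup.
// In reality this consults registration data and may load a DLL.
Factory expensive_lookup(std::string const&)
{
    ++g_expensive_lookups;
    return Factory{ [] { return std::make_shared<int>(42); } };
}

// Simulates a factory cache: look up once, reuse thereafter.
std::shared_ptr<void> activate_cached(std::string const& name)
{
    static std::map<std::string, Factory> cache;
    auto it = cache.find(name);
    if (it == cache.end())
    {
        it = cache.emplace(name, expensive_lookup(name)).first;
    }
    return it->second.activate(); // still an indirect call every time
}
```

Only the first activation is expensive, but notice that even the cached path never touches the implementation type directly; that is the indirection that uniform construction removes.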

Firstly, no matter how fast the C++/WinRT factory cache is, calling through RoGetActivationFactory or even subsequent calls through the factory cache will always be slower than calling directly into the implementation. A call to RoGetActivationFactory followed by IActivationFactory::ActivateInstance followed by QueryInterface is obviously not going to be as efficient as using a C++ new expression for a locally-defined type. As a consequence, seasoned C++/WinRT developers know to use the make or make_self helper functions when creating objects within a component:

// Class c;
Component::Class c = make<implementation::Class>();

But as you can see, this is not nearly as convenient or concise. Not only must you use a helper function to create the object, you must also disambiguate between the implementation type and the projected type. It’s also easy to forget to do so.

Secondly, using the projection to create the class means that its activation factory will be cached. Normally this is a wonderful thing but if the factory resides in the same DLL that is making the call then you’ve effectively pinned the DLL and prevented it from ever unloading. For many developers this probably doesn’t matter but some system components must support unloading, and this can become rather problematic.
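The pinning problem can be sketched in plain C++ as well (again, all names here are hypothetical; this only models the idea, not the actual C++/WinRT machinery). A cached factory holds a module reference, and a DllCanUnloadNow-style check cannot succeed while that reference is outstanding:

```cpp
#include <cassert>
#include <memory>

// Hypothetical module reference count, analogous to what a
// DllCanUnloadNow implementation consults.
static int g_module_refs = 0;

struct ModuleRef
{
    ModuleRef() { ++g_module_refs; }
    ~ModuleRef() { --g_module_refs; }
};

// A factory object keeps its home module alive for as long as it exists.
struct Factory
{
    ModuleRef ref;
};

// Component-local factory cache: if the component caches its own
// factory, it pins its own module.
static std::shared_ptr<Factory> g_cache;

void activate_from_inside_component()
{
    if (!g_cache)
    {
        g_cache = std::make_shared<Factory>();
    }
}

void clear_cache() { g_cache.reset(); }

// Simulates DllCanUnloadNow: the DLL may unload only with no refs left.
bool can_unload_now() { return g_module_refs == 0; }
```

Once the component activates one of its own classes through the projection, the cached factory keeps the module reference alive and the module can never report that it is safe to unload.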

So this is where the term uniform construction comes in. Regardless of whether the code resides in a project that is merely consuming the class or whether the code resides in the project that is actually implementing the class, the developer can freely use the same syntax to create the object:

// Component::Class c = make<implementation::Class>();
Class c;

When the component is built with -optimize, the call through the language projection will compile down to the same efficient call to the make function that directly creates the implementation type and avoid the syntactic complexity, the performance hit of calling through the factory, and the problem of pinning the component in the process.

Uniform construction applies to any call that is served by the factory under the hood. Practically, that means this optimization serves both constructors and statics. Here’s the original example again:

Class c;
c.Method();
Class::StaticMethod();

Without -optimize, the first and last statements require calls through the factory object. With -optimize, neither does: those calls compile directly against the implementation and even have the potential of being inlined. This speaks to the other term often used when talking about -optimize, namely direct implementation access. Language projections are nice, but when you can directly access the implementation you can and should take advantage of it to produce the most efficient code possible. Now C++/WinRT will do this for you, without forcing you to leave the safety and productivity of the projection.
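The difference between the two call paths can be illustrated with a small plain-C++ sketch (a simplified model, not the actual generated code): a projected call dispatches through a COM-style virtual interface, while direct implementation access binds to the concrete type and is a candidate for inlining:

```cpp
#include <cassert>

// COM-style ABI: every projected call goes through a virtual interface.
struct IClass
{
    virtual int Method() = 0;
    virtual ~IClass() = default;
};

namespace implementation
{
    struct Class : IClass
    {
        int Method() override { return 42; }
    };
}

// Without -optimize: the projection only holds an interface pointer,
// so the call dispatches through the vtable.
int call_through_abi(IClass* c) { return c->Method(); }

// With -optimize (direct implementation access): the concrete type is
// known at the call site, so the call binds directly and can be inlined.
int call_direct(implementation::Class& c) { return c.Method(); }
```

Both paths produce the same result, but only the second gives the compiler full knowledge of the target and the freedom to optimize accordingly.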

So why is this a breaking change? Well, the component must cooperate in order to allow the language projection to reach in and directly access its implementation types. As C++/WinRT is a header-only library, you can peek inside and see what’s going on. Without -optimize, the Class constructor and StaticMethod member are defined by the projection as follows:

namespace winrt::Component
{
    inline Class::Class() :
        Class(impl::call_factory<Class>([](auto&& f) { return f.template ActivateInstance<Class>(); }))
    {
    }
    inline void Class::StaticMethod()
    {
        impl::call_factory<Class, Component::IClassStatics>([&](auto&& f) { return f.StaticMethod(); });
    }
}

You don’t need to understand any of this (and remember never to rely on anything in the impl namespace), but it should be clear that both calls involve a function named call_factory. That’s your clue that these calls go through the factory cache and are not directly accessing the implementation. With -optimize, these same functions are not defined at all! Instead, they are declared by the projection and their definitions are left up to the component. The component can then provide definitions that call directly into the implementation. This is where the breaking change comes in: those definitions are generated for you when you use both -component and -optimize, and they appear in a file called Type.g.cpp, where Type is the name of the WinRT class being implemented. That’s why you may hit various linker errors when you first enable -optimize in an existing project: you need to include that generated file in your implementation to stitch things up. In our example, Class.h might look like this (regardless of whether -optimize is being used):

// Class.h
#pragma once
#include "Class.g.h"

namespace winrt::Component::implementation
{
    struct Class : ClassT<Class>
    {
        Class() = default;

        static void StaticMethod();
        void Method();
    };
}
namespace winrt::Component::factory_implementation
{
    struct Class : ClassT<Class, implementation::Class>
    {
    };
}

Your Class.cpp is where it all comes together:

#include "pch.h"
#include "Class.h"
#include "Class.g.cpp" // <-- Add this line!

namespace winrt::Component::implementation
{
    void Class::StaticMethod()
    {
    }

    void Class::Method()
    {
    }
}

As you can see, following the inclusion (and definition) of the implementation class, Class.g.cpp is included to provide the definitions of those functions that the projection left undefined. Here’s what those definitions look like inside the Class.g.cpp file:

namespace winrt::Component
{
    Class::Class() :
        Class(make<Component::implementation::Class>())
    {
    }
    void Class::StaticMethod()
    {
        return Component::implementation::Class::StaticMethod();
    }
}

So this nicely completes the projection with efficient calls directly into the implementation, avoids those calls to the factory cache, and the linker is satisfied.

The final thing that -optimize does for you is change the implementation of your project’s module.g.cpp, which helps you implement your DLL’s DllGetActivationFactory and DllCanUnloadNow exports, in such a way that incremental builds tend to be much faster because it eliminates the strong type coupling that version 1 of C++/WinRT required. This is often referred to as type-erased factories. Without -optimize, the module.g.cpp file that is generated for your component starts off by including the definitions of all your implementation classes, Class.h in this example. It then directly creates the implementation factory for each class as follows:

if (requal(name, L"Component.Class"))
{
    return winrt::detach_abi(winrt::make<winrt::Component::factory_implementation::Class>());
}

Again, you don’t need to understand any of this but it is useful to see that this requires the complete definition for any and all classes implemented by your component. This can have a dramatic effect on your inner loop as any change to a single implementation will cause module.g.cpp to recompile. With -optimize, this is no longer the case. Instead, two things happen to the generated module.g.cpp file. The first is that it no longer includes any implementation classes. In this example, it will not include Class.h at all. Instead, it creates the implementation factories without any knowledge of their implementation:

void* winrt_make_Component_Class();

if (requal(name, L"Component.Class"))
{
    return winrt_make_Component_Class();
}

Obviously, there is no need to include their definitions, and it’s up to the linker to resolve the winrt_make_Component_Class function’s definition. Of course, you don’t need to think about this because the Class.g.cpp file that gets generated for you, and that you previously included to support uniform construction, also defines this function. Here’s the entirety of the Class.g.cpp file that is generated for this example:

void* winrt_make_Component_Class()
{
    return winrt::detach_abi(winrt::make<winrt::Component::factory_implementation::Class>());
}
namespace winrt::Component
{
    Class::Class() :
        Class(make<Component::implementation::Class>())
    {
    }
    void Class::StaticMethod()
    {
        return Component::implementation::Class::StaticMethod();
    }
}

As you can see, the winrt_make_Component_Class function directly creates your implementation’s factory. This all means that you can happily change any given implementation and the module.g.cpp need not be recompiled at all. It is only when you add or remove WinRT classes that the module.g.cpp will be updated and need to be recompiled.
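The decoupling idea can be condensed into a small plain-C++ sketch (a simplified model with hypothetical names, not the generated code itself): one translation unit holds only the name-to-function dispatch and a forward declaration, while a separate translation unit holds the full implementation, so changing the implementation never forces the dispatcher to recompile:

```cpp
#include <cassert>
#include <cstring>

// --- What module.g.cpp sees: only a forward declaration.
// No class definition is needed here, so this translation unit is
// insulated from changes to the implementation.
void* winrt_make_Component_Class_sim(); // hypothetical stand-in name

void* get_factory(char const* name)
{
    if (std::strcmp(name, "Component.Class") == 0)
    {
        return winrt_make_Component_Class_sim();
    }
    return nullptr; // unknown class name
}

// --- What Class.g.cpp sees: the full implementation, which in a real
// project would live in its own translation unit.
namespace implementation
{
    struct Class
    {
        static void* make_factory()
        {
            static int factory_object = 1; // stand-in for the real factory
            return &factory_object;
        }
    };
}

void* winrt_make_Component_Class_sim()
{
    return implementation::Class::make_factory();
}
```

The linker stitches the two halves together, which is exactly why the dispatch side needs nothing more than the function’s name.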

And that’s all for today. Stay tuned for more about C++/WinRT 2.0!
