C++/WinRT: Coroutines and the Thread Pool

Previous: Producing Async Objects

As we saw in the previous installment, creating a basic coroutine is trivial. You can very easily co_await some other async action or operation, simply co_return a value, or craft some combination of the two. To recap, here is a coroutine that is not asynchronous at all:

IAsyncOperation<int> return_123()
{
    co_return 123;
}

Even though it executes synchronously, it still produces a completely valid implementation of the IAsyncOperation interface:

int main()
{
    int result = return_123().get();
    assert(result == 123);
}

Here is one that will wait for five seconds before returning the value:

using namespace std::chrono;

IAsyncOperation<int> return_123_after_5s()
{
    co_await 5s;
    co_return 123;
}

This is ostensibly going to execute asynchronously and yet the main function remains largely unchanged, thanks to the get function’s blocking behavior:

int main()
{
    int result = return_123_after_5s().get();
    assert(result == 123);
}

The co_return statement in the last coroutine will execute on the Windows thread pool, since the co_await expression is a chrono duration that uses a thread pool timer. The co_await expression represents a suspension point, and a coroutine may resume on a completely different thread following suspension. You can also make this explicit using resume_background:

IAsyncOperation<int> background_123()
{
    co_await resume_background();
    co_return 123;
}

There is no apparent delay this time, but the coroutine is guaranteed to resume on the thread pool. What if you are not sure? You might have a cached value and only want to introduce a context switch if the value must be retrieved from latent storage. This is where it is good to remember that a coroutine is also a function, so all the normal rules apply:

IAsyncOperation<int> background_123()
{
    static std::atomic<int> result{0};

    if (result == 0)
    {
        co_await resume_background();
        result = 123;
    }

    co_return result;
}

This is only conditionally going to introduce concurrency. Multiple threads could conceivably race in and call background_123, causing a few of them to resume on the thread pool, but eventually the atomic will be primed and the coroutine will begin to complete synchronously. That race is the worst case, and it is benign, since the racing callers all store the same value.

Let us imagine the value may only be read from storage once a signal is raised, indicating that the value is ready. We can use two coroutines to pull this off:

handle m_signal{ CreateEvent(nullptr, true, false, nullptr) };
std::atomic<int> m_value{ 0 };

IAsyncAction prepare_result()
{
    co_await 5s;
    m_value = 123;
    SetEvent(m_signal.get());
}

IAsyncOperation<int> return_on_signal()
{
    co_await resume_on_signal(m_signal.get());
    co_return m_value;
}

The first coroutine artificially waits for five seconds, sets the value, and then signals the Win32 event. The second coroutine waits for the event to become signaled, and then simply returns the value. Once again, the thread pool is used to wait for the event, leading to an efficient and scalable implementation. Coordinating the two coroutines is straightforward:

int main()
{
    prepare_result();

    int result = return_on_signal().get();
    assert(result == 123);
}

The main function kicks off the first coroutine but does not block waiting for its completion. The call to get then blocks the main thread until the second coroutine completes; the second coroutine itself waits for the signal on the thread pool rather than tying up a thread of its own.

Thus far, I’ve focused on the thread pool, or what might be called background threads. C++/WinRT loves the Windows thread pool, but invariably you need to get work back onto a foreground thread representing some user interaction. Join me next time as I explore ways to take precise control over the execution context.