My goal for this talk is to get you excited about asynchronous programming with Drupal.
There's still a little way to go before Drupal can support what I'm talking about today. In particular, some work needs to be done in the Drupal kernel to make way for the event loop. Help me execute my "Proposal: Restructuring Drupal's internals" to make that happen.
The following resources are referenced in this presentation:
- “JavaScript Visualised - Event Loop, Web APIs, (Micro)task Queue” by Lydia Hallie
- The Revolt documentation
- Proposal: Restructuring Drupal's internals
- Drupal Core issues
- AMPHP
Let's look at Thinking Async
My goal in this presentation is to get you excited about the possibilities of async Drupal and to help me make this a reality. Scan the QR code on screen to view links to the things I talk about as well as my roadmap for asynchronous Drupal.
The word "asynchronous" literally means "not synchronous". So before we dive into async, let's look at synchronous programming to understand what we're no longer doing.
I'll start with a live demo. I have here a pack of cards that has been lightly shuffled. I'm giving it to a volunteer in the front row; consider them my data source.
Could you please retrieve the Ace of Hearts?
[I wait until the person has retrieved the card and take it from them to show to the audience]
As you could see, I requested something from my data source and then stood there waiting for the result. That is the behaviour of a synchronous program, it does only one thing at a time and waits while that thing is happening.
We can draw this as a simplified diagram. We start on the left and make a request to our data source. There we wait until we get some result and then we're done.
If we draw this as a timeline then we can see that our application hands off control to the data source and just sits there waiting for the data source to be done.
Thankfully in our demo everything went successfully. However, an important part of a programmer's job is to take error handling into account.
If an error occurred in a synchronous program, we would likely throw an exception and switch to an entirely different execution path. Quite disruptive.
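As a minimal sketch of this, the card demo can be translated to hypothetical PHP (the function and deck here are illustrative, not Drupal code): a synchronous fetch blocks until it returns a value, and the error path abandons the normal flow entirely by throwing.

```php
<?php

// Hypothetical synchronous data source: returns the card or throws.
function fetchCard(array $deck, string $wanted): string
{
    foreach ($deck as $card) {
        if ($card === $wanted) {
            return $card;
        }
    }
    // The error path switches to an entirely different execution path.
    throw new RuntimeException("404 Not Found: $wanted");
}

$deck = ['Ace of Hearts', 'Ace of Spades'];

try {
    // The caller waits here until the result is available.
    echo fetchCard($deck, 'Ace of Hearts'), "\n";
} catch (RuntimeException $e) {
    echo 'Error: ', $e->getMessage(), "\n";
}
```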
Keep that in mind as we start looking at asynchronous programs.
You may be wondering: this all works fine, our databases are fast enough, so why should we even care about async?
Even though for our current types of applications synchronous programs are fast enough, this model places constraints on what we can build.
As the complexity of applications grows we're processing more and more data. Additionally imagine that our data source was some kind of external server rather than our own speedy database. At that point our application would just be spinning its wheels waiting for the external server to complete.
Another current constraint is that Drupal can only handle a single request at a time. We also can't handle long running tasks such as WebSocket connections because we'd only be able to process data for one connection at a time, while our other connections are blocked.
Asynchronous processing will enable us to gain performance improvements in our current single request applications. It'll also allow us to build applications that handle multiple requests at the same time. Finally, it'll allow us to build entirely new types of applications, such as those handling persistent connections or doing continuous processing of data, like in AI workflows.
I hope that gets you as excited about asynchronous Drupal as it does me.
Let's now do our asynchronous demo!
Rather than a single deck of cards, I now have three decks of cards, each lightly shuffled. I'm going to hand these to three volunteers in the audience. In order, I'll ask them to retrieve the Ace of Clubs, the Ace of Diamonds, and the Ace of Spades.
Rather than waiting for them to be done though, I'm going to ask them to promise me they'll do the task and give me a result. Also, remember how our error handling went in the synchronous application? It'd be very disruptive if something went wrong and they just shouted an exception through the room, disrupting the work of others. Instead, I'll give each of them an envelope and ask them to place the result in that envelope.
[I wait for the participants to find the card or the error message and collect the three envelopes]
Let's take a look at the results. I've received here the Ace of Diamonds from the first envelope. The second envelope contains the Ace of Clubs, and the third envelope should contain the Ace of Spades, but instead I have a note here. It looks like we had an error and I've received a "404 Not Found" exception, rather than the card.
Let's evaluate this demo in the same way we evaluated our synchronous demo.
Again if we draw our process just now as a simplified diagram then you can see we start on the left. Rather than a single data source we now have multiple.
I no longer waited for the data source to be finished, but instead I was able to loop around to all of them and allowed multiple things to happen at the same time. Only once they were all done did I move on to review the results.
We can again draw this as a timeline too. This time the application started up a small loop that communicated to each data source. We could start each of them and only then did we have to wait before we were able to collect results.
Importantly though, we were able to put multiple data sources to work at the same time.
Let's revisit error handling. This time we actually saw something go wrong in one of the data sources. However, rather than throwing an exception and disrupting the entire flow of the application, our output value changed. I no longer got back the card directly; it was wrapped in an envelope that helped me decide how to deal with errors.
We've seen that we can do multiple things at the same time but in order to be able to coordinate this we need to ask our functions to change the return value they provide to us.
Let's see what that has to do with PHP's introduction of Fibers.
The change in flow and return value becomes most clear when we put the two diagrams next to each other.
We can make this even clearer when we turn the diagrams into some example code.
On the left we have our default synchronous implementation. Of course nothing goes wrong. However, if something went wrong then we would change our flow and throw an exception to indicate the problem.
Assuming everything goes correctly then we return an integer result for our expensive calculation.
On the right side we see our asynchronous equivalent. This time in case something goes wrong we'll have to reject our promise, we couldn't fulfill it to provide a result. If everything goes well then we'll fulfill the promise and have it contain our result.
We can see that the return type of our asynchronous function is no longer an integer but it is now a promise.
In software development, this difference in return type is known as function coloring. The caller of our function needs to know what it will get back, so it needs to know what color our function is.
In order for a caller to use our asynchronous function without itself blocking, it'll also have to adjust its return type and start working with promises rather than direct returns and exceptions.
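The two colors can be sketched side by side. This sketch models the promise on AMPHP v3's `DeferredFuture`/`Future` API; the `somethingWentWrong()` helper is hypothetical and stands in for the slide's expensive calculation.

```php
<?php

require 'vendor/autoload.php'; // assumes amphp/amp v3 is installed

use Amp\DeferredFuture;
use Amp\Future;

// Hypothetical failure condition for illustration.
function somethingWentWrong(): bool
{
    return false;
}

// Synchronous color: returns an int directly, or throws.
function expensiveCalculationSync(): int
{
    if (somethingWentWrong()) {
        throw new RuntimeException('Calculation failed');
    }
    return 42;
}

// Asynchronous color: always returns a Future; errors reject it
// instead of throwing into the caller's flow.
function expensiveCalculationAsync(): Future
{
    $deferred = new DeferredFuture();
    if (somethingWentWrong()) {
        $deferred->error(new RuntimeException('Calculation failed'));
    } else {
        $deferred->complete(42);
    }
    return $deferred->getFuture();
}
```

Note how the async caller now receives a `Future` (the envelope) rather than the integer itself.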
If we look at the difference in functions then I hope you can now see that adopting async in this manner would be a huge undertaking for existing codebases. This is one of the reasons why Drupal hasn't adopted asynchronous programming in the past.
We'd have to go through the entirety of Drupal and convert all of its code and all of the contributed modules from the synchronous form to use promises.
That would be very disruptive and lead to a lot of breaking changes.
Thankfully in PHP 8.1 we got a new tool to help us overcome this problem in the form of Fibers.
Because Fibers are a language-level concept, PHP can change how our code executes without us actually writing different code.
That allows us to take a synchronous program like shown in the diagram.
And then run some asynchronous code below it.
Granted, this is something we could already do, by making our synchronous code block on the promise returned by the asynchronous code below it.
However, using promises in that manner would not be very useful because it would still limit what kind of asynchronous tasks we can do next to one another on a high level.
The magic of Fibers is in what it allows us to do above synchronous looking code.
By changing how we call code at the top level and wrapping our individual tasks in a Fiber, we can call synchronous looking code that at the very bottom may be performing asynchronous tasks and may be able to let other things happen while it waits for a result.
This allows us to make our application asynchronous without having to rewrite the entire application.
Thus, what Fibers allows us to do is to build an asynchronous sandwich without changing the code in the middle.
One way this might happen in Drupal is that we could make the entry point of your application, the request handler, asynchronous. Any code in the middle, which is where your module code lives, could remain synchronous, but the lowest-level code, for example the Entity API, might become asynchronous and be able to do multiple things at the same time.
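The sandwich can be sketched with plain Fibers. Here `loadEntity()` and `buildPage()` are hypothetical stand-ins for a future async Entity API and for synchronous module code; only the top and bottom slices know about Fibers.

```php
<?php

// Bottom layer (e.g. a future entity API): suspends while "waiting"
// for a data source, so other work can happen in the meantime.
function loadEntity(int $id): string
{
    Fiber::suspend();
    return "entity:$id";
}

// Middle layer (module code): a plain synchronous-looking call.
// It has no idea that anything asynchronous happens below it.
function buildPage(int $id): string
{
    return 'Page for ' . loadEntity($id);
}

// Top layer (request handler): owns the Fiber and the scheduling.
$fiber = new Fiber(fn () => buildPage(1));
$fiber->start();                // runs until loadEntity() suspends
// ... the application could do other work here ...
$fiber->resume();               // the "wait" is over; finish the task
echo $fiber->getReturn(), "\n"; // Page for entity:1
```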
It’s important to note that our new asynchronous sandwich contains two loops.
Loops in programming are very useful, but we must be careful when using them. For example, the inner loop might do a lot of asynchronous processing, but if it just runs until its own tasks are complete, it might still prevent other work, coordinated by the upper loop, from happening at the same time.
We can take a look at what this looks like in code. The fact that asynchronous programming is difficult became clear when we were working on the initial implementation of making placeholder rendering async.
If we go through this code from top to bottom, then we first plan our tasks. Wrapping each individual task in a separate Fiber. That lets PHP know that this is its own stack of function calls that it can manage.
Once we've planned all of our tasks we'll need to manage these fibers in our coordination loop.
While we have fibers that are not complete, we'll loop through each of the fibers. We first must make sure the fiber has started executing. If it has started executing but has suspended in a previous iteration to indicate that other work might happen, then we resume it to let it do some more work.
Then we check whether it has terminated. The fiber may be done working in which case it's terminated, but it may also be suspended, waiting for something else.
If it's not terminated we'll go around the loop once more, but if it is terminated then we'll look at the result and replace the placeholder with its rendered output.
We then need to remove the fiber from our queue as we no longer need it.
This is the top level loop in our previous diagram with the code in our fiber having a potentially lower level loop for a data source.
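The coordination loop described above can be sketched like this. This is a simplified sketch, not the actual renderer code: `renderPlaceholder()` and the `$placeholders` array are illustrative stand-ins.

```php
<?php

// Stand-in for a placeholder-rendering task that suspends once
// while waiting on a data source.
function renderPlaceholder(string $placeholder): string
{
    Fiber::suspend();
    return "<rendered:$placeholder>";
}

$placeholders = ['header' => 'user-block', 'footer' => 'stats-block'];

// Plan: wrap each task in its own Fiber so PHP can manage each
// call stack independently.
$fibers = [];
foreach ($placeholders as $key => $placeholder) {
    $fibers[$key] = new Fiber(fn () => renderPlaceholder($placeholder));
}

// Coordinate: keep looping while fibers remain incomplete.
$markup = [];
while ($fibers) {
    foreach ($fibers as $key => $fiber) {
        if (!$fiber->isStarted()) {
            $fiber->start();
        } elseif ($fiber->isSuspended()) {
            $fiber->resume();                    // let it do more work
        }
        if ($fiber->isTerminated()) {
            $markup[$key] = $fiber->getReturn(); // use the result
            unset($fibers[$key]);                // no longer needed
        }
    }
}

print_r($markup);
```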
However, we should be mindful that our renderer code may not be at the top of our application. Drupal may want to do other things if our renderer has no work to do.
We can visualize this as nesting our two loops inside yet another coordination loop.
The while loop we implemented didn't actually hand off control to anything else though and just kept iterating until all the fibers were complete.
To play nicely with other tasks that may need to happen outside of the renderer we had to implement some additional code.
You may have spotted it already.
Most of our code remains the same: we plan tasks and we start our loop to manage our fibers. However, inside the loop we now add some more code.
Every time we look at one of our fibers and it's done some work but isn't complete yet, before we move to the next fiber in our foreach loop we'll check whether we are ourselves within a fiber context. If that's the case, then rather than continuing to the next fiber immediately, we'll suspend the fiber we're in ourselves. This hands control back to the higher-level loop and allows it to do other things.
Once our higher-level control loop resumes the fiber we are in, we'll pick up processing right where we left off and continue with the next fiber.
The rest of our code remains the same.
That's what it takes to implement a manual coordination of making sure all Fibers are executed without caring about the order and making sure other things can happen at the same time. Pretty complex.
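The cooperative variant adds essentially one check inside that loop. Again a simplified, illustrative sketch rather than the actual Drupal code:

```php
<?php

// Stand-in task that suspends once while "waiting".
function renderPlaceholder(string $placeholder): string
{
    Fiber::suspend();
    return "<rendered:$placeholder>";
}

$fibers = [
    'a' => new Fiber(fn () => renderPlaceholder('one')),
    'b' => new Fiber(fn () => renderPlaceholder('two')),
];

$markup = [];
while ($fibers) {
    foreach ($fibers as $key => $fiber) {
        if (!$fiber->isStarted()) {
            $fiber->start();
        } elseif ($fiber->isSuspended()) {
            $fiber->resume();
        }
        if ($fiber->isTerminated()) {
            $markup[$key] = $fiber->getReturn();
            unset($fibers[$key]);
        } elseif (Fiber::getCurrent() !== null) {
            // We're inside a fiber ourselves: yield to the higher-level
            // loop so other application work can happen. It resumes us
            // right here, and we continue with the next fiber.
            Fiber::suspend();
        }
    }
}

print_r($markup);
```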
The renderer is probably the simplest way to perform asynchronous tasks. We have a bunch of tasks and we want to process all of them. If any of them have an error, then we stop the processing and abort. We can’t complete the request at that point.
However, there are other types of async workloads that we can perform. Each of them will require slightly different patterns to be applied using Fibers.
Let's take a look at what it might look like to race some tasks while making sure that we don't block any other work that might be going on around us.
We read this code from bottom to top.
We start by creating our list of tasks. We need to make sure each of these are in a Fiber since they may need to wait for a data source.
Once we have our list then we'll wait until the first result is done.
If we look at our await first function then the first thing we do is start all of our fibers. This makes sure they can do some initial processing. They may already complete in this moment but if they need to wait for anything externally then this gives them the chance to start waiting.
The next part of this is that we loop until we're done, exiting through a return.
We'll loop through our fibers and if they're suspended then we'll give them some more processing time. If any of the fibers have terminated then we're done and we can return their result (or throw an exception from this point).
If we've looped through our fibers and none of them were done yet then we'll check if we should let other tasks in our application run by suspending ourselves. If we're the only task being executed (i.e. we're not in a fiber) then we sleep for a little bit to make sure we don't heat up the CPU.
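An await-first helper along those lines might be sketched as follows. This is illustrative, not production code, and the two sample tasks are hypothetical stand-ins for a cache lookup and an HTTP request.

```php
<?php

/**
 * Returns the result of whichever fiber terminates first.
 *
 * @param Fiber[] $fibers
 */
function awaitFirst(array $fibers): mixed
{
    // Start every fiber so each gets a chance to begin waiting.
    foreach ($fibers as $fiber) {
        $fiber->start();
    }
    // Loop until we're done, exiting through a return.
    while (true) {
        foreach ($fibers as $fiber) {
            if ($fiber->isSuspended()) {
                $fiber->resume();               // more processing time
            }
            if ($fiber->isTerminated()) {
                return $fiber->getReturn();     // first result wins
            }
        }
        if (Fiber::getCurrent() !== null) {
            Fiber::suspend();   // let other application tasks run
        } else {
            usleep(1000);       // nothing else to do; don't spin the CPU
        }
    }
}

$tasks = [
    new Fiber(function () { Fiber::suspend(); return 'cache result'; }),
    new Fiber(function () { Fiber::suspend(); Fiber::suspend(); return 'http result'; }),
];
echo awaitFirst($tasks), "\n"; // cache result
```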
This looks pretty simple to implement ourselves, but there are really some things missing here.
Once we have a result, the other tasks are still executing in the background; we don't do any clean-up, such as aborting open HTTP requests to reduce the load on our own and external servers.
If the first task fails then our entire process fails. It could happen that we're offline and our HTTP request immediately fails, but we may have had another task looking at our cache and could've gotten the result from there, just a bit slower.
Similarly if all our tasks here take a long time then we have no way to set a maximum execution time and try something else instead.
Implementing those things would make our code vastly more complex.
Let's take a short look at what we've covered so far.
Using Fibers is great but it does cause a few problems.
Rather than making sure we write our own loops correctly everywhere, what if we could create a single loop that does all the scheduling for us?
This demo is inspired by a great 15 minute presentation about the JavaScript event loop by Lydia Hallie. I've adapted this to PHP and tried to compress it to a few minutes. Her talk is well worth watching.
If we look at our application then we start with some set-up.
We'll start our websocket server.
Which creates a new websocket.
Then we'll tell our Event Loop that whenever a new message comes in on our socket we want to call a callback.
Our Event Loop understands this and registers our callback to the list of callbacks.
We can then unwind our callstack and move to the next stage of our application.
Now that we've done our set-up
We can start our EventLoop.
A message has come in on our websocket and our event loop has scheduled our callback as a task to be executed.
Since we're not working on anything else we can start working on our callback task.
To respond to the websocket message we need to load some data.
This data loading cannot finish immediately and suspends. Our event loop will park our current task with its full callstack.
In the meantime, another message comes in on our socket.
Since we're not working on anything, remember, our data loading is suspended, we can pick this task up immediately.
This task is easier and we can send back a reply immediately. Completing the task.
In the meantime, our data loading has finished so...
... our suspended data loading is now put on the queue as the next task to pick up.
Since we finished our previous task we can pick this up and process it. We send our loaded data back on our socket and that completes this task.
Unwinding our callstack.
We've been able to use this loop to schedule our different tasks and let it coordinate our asynchronous workloads. We saw in our load async data function that it could tell the loop it had to wait for something else and in the meantime other things could happen.
As you've seen, having a single loop can simplify our application because it means we only need to describe our tasks rather than having to take care of scheduling ourselves.
As a trade-off, the different parts of our application do need to be aware of how to interact with this loop. So there must be a single loop per application.
As with any component of which there must be exactly one, we have a choice as the Drupal framework: we can either build our own or use an existing one.
In this case I'm happy that as Drupal we've adopted the Revolt framework, which was created by the maintainers of the popular AMPHP and ReactPHP libraries and is maintained by one of the PHP contributors that built Fibers.
This ensures that our implementation will have compatibility with asynchronous libraries that are out there and ensures we stay off the island.
Here's a code example of what using the Revolt event loop might look like.
We no longer use Fibers directly, but instead we describe what our application should do.
We start with telling our event loop that we're going to schedule some async tasks and request a suspension. Then we'll schedule a task that repeats each second and prints some text.
We'll then schedule a task that should execute after a five second delay. The task will print some text, cancel our repeating task, then resume our suspension, and then print some more text.
Once that set-up is complete we print some initial text and tell the event loop we'd like to wait for our suspension to be resumed. Finally we print script end.
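Assuming Revolt's documented `EventLoop` API, the script described above might look like this sketch (requires `revolt/event-loop` via Composer):

```php
<?php

require 'vendor/autoload.php';

use Revolt\EventLoop;

// Tell the event loop we're going to schedule async tasks and
// request a suspension we can wait on later.
$suspension = EventLoop::getSuspension();

// A task that repeats each second and prints some text.
$repeatId = EventLoop::repeat(1, fn () => print "tick\n");

// A task that executes after a five-second delay.
EventLoop::delay(5, function () use ($suspension, $repeatId): void {
    echo "delay callback start\n";
    EventLoop::cancel($repeatId);   // cancel our repeating task
    $suspension->resume();          // signal the suspended code
    echo "delay callback end\n";
});

echo "script start\n";
$suspension->suspend();             // run the loop until resumed
echo "script end\n";
```

Running this prints "script start", four ticks, then "delay callback start", "delay callback end", and only then "script end".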
Note the order of the output here.
We see that the last line of code in our delay callback was printed...
before the script end.
Thus calling suspension resume does not immediately start executing the code after the suspension. Instead it lets the current code keep executing, but it signals that once this callstack unwinds that the suspended code can resume.
On this slide we already see a few of the primitives that Revolt has to describe async tasks. I quickly want to run through all of them.
The first primitive is the easiest. Defer allows you to schedule a potentially async task to happen once the event loop runs.
We can see this when we look at the output. Although our program is in the order of 1, 3, 2, our output is neatly printed as 1, 2, and 3.
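A minimal defer sketch (assuming `revolt/event-loop` is installed), with the literals appearing in the program in the order 1, 3, 2:

```php
<?php

require 'vendor/autoload.php';

use Revolt\EventLoop;

echo "1\n";

// Scheduled now, but only executed once the event loop runs.
EventLoop::defer(fn () => print "3\n");

echo "2\n";

EventLoop::run(); // output: 1, 2, 3
```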
The next primitive is similar to defer, but rather than executing the task immediately, the task is delayed by a number of seconds.
We first set the timezone to where we currently are, in Vienna.
Then we schedule our task three seconds from now and output the current time.
Finally we kick off our event loop.
We see our first output immediately and three seconds later we see our delayed output, before our program terminates.
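The delay example might be sketched as (again assuming `revolt/event-loop`):

```php
<?php

require 'vendor/autoload.php';

use Revolt\EventLoop;

// Set the timezone to where we currently are.
date_default_timezone_set('Europe/Vienna');

// Schedule a task three seconds from now.
EventLoop::delay(3, fn () => print 'delayed: ' . date('H:i:s') . "\n");

echo 'start:   ' . date('H:i:s') . "\n";

// Kick off the event loop; it exits once the delayed callback has run.
EventLoop::run();
```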
The next primitive that Revolt provides allows us to repeat tasks. We repeat every tenth of a second and use a static variable as a counter.
Each iteration we echo "tick", and once our loop has iterated three times we use the callback ID that the event loop passes us to cancel our repetition.
Once the set-up is complete we run our event loop to execute the program.
We can see that the output is tick three times before the program exits.
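As a sketch, assuming revolt/event-loop, the repeat-and-cancel pattern looks like this:

```php
<?php
// Requires revolt/event-loop (composer require revolt/event-loop).
require __DIR__ . '/vendor/autoload.php';

use Revolt\EventLoop;

// Repeat every tenth of a second, using a static variable as counter.
EventLoop::repeat(0.1, function (string $callbackId): void {
    static $count = 0;

    echo "tick\n";

    // After three iterations, use the callback ID the event loop passes
    // us to cancel the repetition so the loop can exit.
    if (++$count === 3) {
        EventLoop::cancel($callbackId);
    }
});

EventLoop::run();
```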
Another primitive, which we saw earlier in the example explaining how an event loop functions, is the onReadable primitive.
This will allow us to provide a callback whenever there is new data available on the socket.
In this case we read from the socket and then parse the incremental data if we have some. If we don't have any incremental data then we check if the stream is still alive. If the socket has been closed then we cancel our callback from the event loop.
There's no output once we run this example.
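A self-contained sketch of the pattern follows. The talk's example reads from a real socket; here a local socket pair stands in for it, and an echo stands in for the incremental parser, both my own substitutions so the snippet runs on its own.

```php
<?php
// Requires revolt/event-loop (composer require revolt/event-loop).
require __DIR__ . '/vendor/autoload.php';

use Revolt\EventLoop;

// A local socket pair stands in for a real network socket here.
[$reader, $writer] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($reader, false);

fwrite($writer, 'hello');
fclose($writer);

// The callback is invoked whenever new data is available on the socket.
EventLoop::onReadable($reader, function (string $callbackId, $socket): void {
    $chunk = fread($socket, 8192);

    if ($chunk !== false && $chunk !== '') {
        // New incremental data arrived; a real program would parse it here.
        echo "received: {$chunk}\n";
    } elseif (feof($socket)) {
        // The stream was closed: cancel our callback so the loop can exit.
        EventLoop::cancel($callbackId);
        fclose($socket);
    }
});

EventLoop::run();
```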
The final primitive is the signal primitive. This allows handling signals from the operating system. It's most useful when creating command line applications such as Drush.
We start by using the repeat primitive to print a tick every second with the current date and time.
In this case we don't use the callback ID that we get from the repeat.
Instead we implement a signal handler that will listen to SIGINT, which is the signal that the operating system sends when you press ctrl + C. When that signal is received our callback will print some text and then exit the application.
Once set-up is complete we run the event loop.
In this case I let the application run for three seconds before I pressed ctrl + C.
You can see that the current date time is printed three times before my interrupt was sent and exited the program.
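A sketch of that program, assuming revolt/event-loop plus signal support (ext-pcntl, or ext-uv/ext-ev). The delayed posix_kill at the end is my own addition so the sketch terminates on its own; interactively you would simply press ctrl + C.

```php
<?php
// Requires revolt/event-loop plus ext-pcntl (or ext-uv/ext-ev) for
// signal handling, and ext-posix for posix_kill below.
require __DIR__ . '/vendor/autoload.php';

use Revolt\EventLoop;

date_default_timezone_set('Europe/Vienna');

// Print a tick every second with the current date and time.
EventLoop::repeat(1, fn () => print('tick ' . date('Y-m-d H:i:s') . "\n"));

// Handle SIGINT, the signal the operating system sends on ctrl + C.
EventLoop::onSignal(SIGINT, function (): void {
    echo "Caught SIGINT, shutting down.\n";
    exit(0);
});

// Only for this sketch: send ourselves a SIGINT after ~3.5 seconds so
// the program exits without manual input.
EventLoop::delay(3.5, fn () => posix_kill(posix_getpid(), SIGINT));

EventLoop::run();
```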
Now that we know the primitives that the Revolt event loop provides us we can look at how we can use that within Drupal.
We'll do this by looking at what the renderer would look like when rewritten on top of the event loop. Just as a reminder, here is the current code of the renderer that uses Fibers.
It does its setup, then manages the fibers, making sure to delegate processing to any upper-level loop, and once complete it uses the result of the fiber invocation to replace the placeholder.
This is what that code looks like when we rewrite it with the event loop. It's a lot shorter.
We start by creating a list of tasks. These are still callbacks, but we don't wrap them in fibers anymore.
Once we have our list of tasks we'll pass them to the stream function and then we'll iterate over the results. Once we have a result we check whether we have an error and abort if that's the case. If there was no error then we process the result as we did previously.
This looks a lot more readable. Though I did cheat a little bit because the heavy lifting of scheduling our tasks on the event loop now happens in the "stream" function.
However, it's important to note that this function is not unique to our renderer but can be used for any list of tasks where we don't care about the order of the result.
We can look at the implementation of the stream function to see how it works.
You can see that the first argument is a list of tasks or "operations". We then ask the event loop for a suspension because we know that when the operations are running we'll have to wait for results.
Next we loop through all of our operations and we tell the event loop that we want to do some deferred execution so that it may happen asynchronously.
Within our deferred callback we implement error handling. We'll try to execute the operation and get its result. If everything works correctly then we will tell our suspension that whatever is waiting can resume, providing the initial key of the operation and a success result with the value.
In case of an exception we catch the exception and then resume our caller with the key and an error result instead.
By catching the exception and using a specialised Result class we give the caller of our "stream" function the choice of how to deal with successes and failures, rather than forcing a single error handling strategy on them.
Once all of our tasks are scheduled we'll have to tell the event loop that we need results. In this case we don't care about the order of the results, so we can use the same suspension every time.
However, we must suspend the same number of times that we started tasks so that we can be fed each of the results.
When we get a result from our suspension we use PHP's yield keyword to provide the original key and the result to our caller. This allows the calling code to iterate over the results using a foreach loop and to associate each result with the key it set for the task, even if the results arrive out of order.
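The steps above can be sketched as follows. This is not the slide's exact code: the Result class is hypothetical, and I buffer completed results rather than resuming the suspension directly from each deferred callback, because resuming the same suspension twice in one tick (as happens when operations complete synchronously) would throw.

```php
<?php
// Requires revolt/event-loop (composer require revolt/event-loop).
require __DIR__ . '/vendor/autoload.php';

use Revolt\EventLoop;

// Hypothetical Result class: holds either a value or an error, so the
// caller chooses how to deal with successes and failures.
final class Result
{
    private function __construct(
        public readonly mixed $value = null,
        public readonly ?\Throwable $error = null,
    ) {}

    public static function success(mixed $value): self
    {
        return new self(value: $value);
    }

    public static function error(\Throwable $error): self
    {
        return new self(error: $error);
    }
}

/**
 * Runs the operations concurrently and yields key => Result pairs in
 * completion order.
 *
 * @param array<array-key, \Closure> $operations
 */
function stream(array $operations): \Generator
{
    $suspension = EventLoop::getSuspension();
    $buffer = [];
    $waiting = false;

    foreach ($operations as $key => $operation) {
        // Defer each operation so it may run asynchronously.
        EventLoop::defer(function () use (&$buffer, &$waiting, $suspension, $key, $operation): void {
            try {
                $buffer[] = [$key, Result::success($operation())];
            } catch (\Throwable $e) {
                $buffer[] = [$key, Result::error($e)];
            }
            // Only wake the consumer if it is currently suspended.
            if ($waiting) {
                $waiting = false;
                $suspension->resume();
            }
        });
    }

    // Suspend until every scheduled task has delivered a result.
    for ($remaining = count($operations); $remaining > 0; $remaining--) {
        while ($buffer === []) {
            $waiting = true;
            $suspension->suspend();
        }
        [$key, $result] = array_shift($buffer);
        yield $key => $result;
    }
}

// Usage: keys are preserved even though completion order may differ.
foreach (stream(['a' => fn () => 1, 'b' => fn () => 2]) as $key => $result) {
    echo $key, ': ', $result->error === null ? $result->value : 'error', "\n";
}
```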
That stream function is already nice and reusable but it only works for results where we don't care about the order. It also doesn't really allow for any cancellations.
Coming up with how this should work isn't easy and thinking of all the different scenarios isn't either.
Thankfully we don't need to write these kinds of functions ourselves but can instead reach for existing libraries such as AMPHP.
Drupal may implement some of its own primitives rather than adding another dependency, but for your own projects I highly recommend reaching for an async library.
AMPHP is a collection of asynchronous libraries that are compatible with Revolt. It has many more libraries than are on the slide here but I want to highlight a few.
If we look at the fundamental package, amphp/amp, then this contains base primitives for task coordination.
You can see that the last argument to each primitive is a cancellation. This allows us to implement a cancellation strategy using Amp's cancellation primitives.
For example, we can set a timeout, or we can cancel on a signal such as SIGINT. We can also cancel based on some other deferred operation, or never cancel using the NullCancellation. We may also combine strategies using the CompositeCancellation: for example, have a timeout but also be mindful of the user pressing ctrl + C to send an interrupt signal.
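A small sketch of the timeout strategy, assuming amphp/amp v3 is installed. A CompositeCancellation wrapping this TimeoutCancellation together with a SignalCancellation for SIGINT would additionally react to ctrl + C, but is omitted here so the snippet doesn't depend on signal support.

```php
<?php
// Requires amphp/amp v3 (composer require amphp/amp).
require __DIR__ . '/vendor/autoload.php';

use Amp\CancelledException;
use Amp\TimeoutCancellation;
use function Amp\async;
use function Amp\delay;

// Cancel whatever operation receives this cancellation after half a second.
$cancellation = new TimeoutCancellation(0.5);

$future = async(function () use ($cancellation): string {
    // delay() accepts a cancellation as its last argument and throws a
    // CancelledException once the timeout elapses.
    delay(10, cancellation: $cancellation);

    return 'finished';
});

try {
    echo $future->await(), "\n";
} catch (CancelledException $e) {
    echo "Cancelled after 0.5s\n";
}
```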
That was a lot of information, so let's summarise what we discussed in the last 40 minutes.
I hope that got you as excited about async Drupal as I am. Find the links on how you can help out and my roadmap article by scanning the QR code.
https://www.alexandervarwijk.com/blog/2025-09-08-proposal-restructuring-drupals-internals