Async series: introduction
This article is the first in a series of articles about Async Programming and Reactive Programming and their use in JavaScript/TypeScript.
Async programming is a fundamental part of developing a modern web application. In today’s web, frontend and backend applications pull or get pushed data from various external and internal sources and deal with many events. For example, an application usually has to deal with streams of data such as:
- REST or GraphQL API calls' responses
- WebSocket messages arriving continuously
- streams of events generated from user interaction with the page
- internal data from application’s services/stores that transit between services and components
All those sources of data bring a set of challenges: how to process and combine them cleanly in terms of code organization, readability and testability, and how to deal with the performance bottlenecks they can cause.
Async Programming is necessary in order to perform actions in response to events happening outside of the main flow of execution of the program. In JavaScript, the tools to manage async programming have evolved over the last decade, and in this series of articles we’ll see the different ways to write async code: callbacks, Promises, async/await and finally Reactive Programming.
Synchronous vs Asynchronous code
In the imperative paradigm, code is executed in the order it is written. The processor executes the code line by line, in sequence (that’s a simplification: your code might be optimized by the compiler/runtime/virtual machine). That was fine for a very long time, but the advent of network computing turned it into an issue. A program that needs to gather data from outside its environment has to deal with latency. Making an HTTP request or calling a remote procedure is not the same as calling a local function: even if the service runs on the same machine, it is much more expensive in terms of time.
In synchronous fashion, the runtime waits for the result of the function before moving on to the next line of code (which probably relies on this result). If we are calling a local function, that’s fine because the runtime executes the local code and makes use of the processor. But if the function is a network request (HTTP query, remote procedure, etc.), then the expensive code is executed on another machine. If we block while awaiting the result, our program does nothing useful for the whole duration of the request. That can be anywhere from a few milliseconds up to full seconds: a processor can execute a lot of code in this timespan. If we block, we waste that time: in a single-threaded runtime such as a JavaScript engine, no other code of our program (handling user input, serving another request) can run until the call returns.
Therefore there is a need to perform non-blocking actions: if an action is asynchronous (meaning its result involves some latency) then the processor should be able to execute other things in the meantime, and when receiving the results, resume its execution of the program: this way we avoid losing lots of CPU cycles. This is the concept of asynchronous/non-blocking code.
Note: The terms asynchronous and non-blocking are not strictly synonyms and their usage is not consistent across technologies but to simplify we’ll use them interchangeably.
In order to perform non-blocking operations, the runtime must support it. It needs to know that some calls (mostly system calls that relate to the filesystem and the network) do not return immediately and have some way of “being notified” (e.g. could be a kind of message queue) so it can resume where it left off when the result is available.
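The “being notified” part can be observed with a timer. This is only a minimal sketch: the order array is a hypothetical name used for illustration, and a zero-delay setTimeout stands in for any operation whose result arrives later.

```javascript
const order = [];

order.push('before');
// The runtime registers the timer and moves on. The callback is queued once
// the timer expires and runs only after the current synchronous code is done.
setTimeout(() => {
  order.push('timer callback');
  console.log(order.join(' -> ')); // before -> after -> timer callback
}, 0);
order.push('after');
```

Even with a delay of 0, the callback runs after the last synchronous line: the runtime picks it up from its queue once the current code has finished.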
Pain points introduced by asynchronous code
Async code is not as straightforward to think about and write as synchronous code. It introduces challenges and pain points for the developer.
Not all code execution is sequential anymore
The friendly sequential model of code execution we enjoy in imperative synchronous code is not applicable. In asynchronous code, the next line of code you see in your editor might be executed much later than the previous one or almost immediately, maybe once, maybe several times or maybe never.
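Here is a minimal sketch of such code. The function bodies and the log array are assumptions made for illustration, and a plain EventTarget stands in for the DOM element so that the sketch also runs outside a browser (in a page you would use document.getElementById('myId') instead).

```javascript
// Stand-in for the DOM element with id "myId"; in a browser this would be
// document.getElementById('myId').
const myElement = new EventTarget();
const log = [];

function myFunction() {
  log.push('myFunction');
}

function myOtherFunction() {
  log.push('myOtherFunction');
}

myFunction();
myElement.addEventListener('click', () => {
  // Not executed here: runs each time a 'click' event is dispatched,
  // which may be never, once or many times.
  log.push('click handler');
});
myOtherFunction();

// Simulate two user clicks happening later.
myElement.dispatchEvent(new Event('click'));
myElement.dispatchEvent(new Event('click'));

console.log(log.join(', '));
// myFunction, myOtherFunction, click handler, click handler
```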
In this example, the runtime executes myFunction, then attaches an event listener to the DOM element with id myId, then executes myOtherFunction. But the function passed to addEventListener (the callback function) is not called in sequence: it might be called once, several times or never, depending on whether the user clicks on the element. Its rate of execution also varies with the user’s actions: it could be executed several times per second or once every few minutes. You can’t really know when writing the code, although you can estimate: a user’s click rate is limited by the physical action of pressing the mouse button, so it won’t be more than a few times per second, whereas a scroll event fires many times per second while scrolling.
The “callback hell” or “pyramid of doom”
In typical async style, where a callback function is passed as a parameter, you introduce one more level of nesting and indentation with each callback. That is fine for a simple case. But often in production code you’ll have to call an asynchronous function within the callback passed to another async function: you now have two levels of nesting and indentation. And it is not rare to have 3, 4 or more of those nested calls. The code becomes hard to read and debug: this is the “callback hell”.
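A sketch of what this can look like with four nested steps. All function names are hypothetical, and the I/O is simulated with setTimeout; each function follows the Node.js convention of calling back with (error, result).

```javascript
// Hypothetical non-blocking operations, simulated with setTimeout.
function readConfig(path, callback) {
  setTimeout(() => callback(null, { userId: 42 }), 10); // "filesystem"
}
function findUser(userId, callback) {
  setTimeout(() => callback(null, { id: userId, name: 'Ada' }), 10); // "database"
}
function fetchOrders(user, callback) {
  setTimeout(() => callback(null, [{ id: 1, owner: user.name }]), 10); // "network"
}
function saveReport(orders, callback) {
  setTimeout(() => callback(null, `report with ${orders.length} order(s)`), 10);
}

function buildReport(done) {
  readConfig('config.json', (err, config) => {
    if (err) return done(err);
    findUser(config.userId, (err, user) => {
      if (err) return done(err);
      fetchOrders(user, (err, orders) => {
        if (err) return done(err);
        saveReport(orders, (err, report) => {
          if (err) return done(err);
          done(null, report); // four levels deep
        });
      });
    });
  });
}

buildReport((err, report) => console.log(err || report));
```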
In this example, each function is non-blocking because it deals with I/O (the filesystem, a database, a network query, etc.), so each time a callback has to be passed. The code becomes deeply nested and takes a pyramid shape, hence the nickname “pyramid of doom”. You can of course write code in a way that limits the indentation issue and favors readability. But it takes discipline and can be more difficult when your callbacks are closures (meaning they need access to the variables defined in upstream callbacks).
How to write asynchronous code?
Over the years, the way to write asynchronous code has evolved. Abstractions have been introduced to ease it. In this article, we’ll start with the first and oldest one, passing callback functions, then look at Promises and async/await. Reactive Programming is the subject of the next article.
Passing callbacks to asynchronous functions
You can see this solution in the previous examples: an async function performs an action (e.g. a query) and, in order to do something with its result, we pass it a function that takes the result as argument. For this, the language has to support first-class functions, meaning that functions are treated like other data types and can be assigned to variables and passed around. This is useful in conjunction with higher-order functions. A higher-order function is a function that takes a function as an argument, returns a function, or both.
Such higher-order functions enable important functionalities:
- they make it possible to implement some design patterns such as the Strategy Pattern without all the boilerplate code necessary in languages that don’t support higher-order functions
- they make it easy to create “configured” or “parametrized” functions through the use of factory functions
- more generally, all the great things of Functional Programming rely on higher-order functions
Let’s see examples:
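The sketches below illustrate the three ideas in turn: a strategy passed as a function, a factory function, and a callback passed to an asynchronous function. All names (total, multiplyBy, addLater, etc.) are hypothetical and only serve the illustration.

```javascript
// Strategy pattern: the pricing rule is an ordinary function passed as argument.
const fullPrice = (amount) => amount;
const halfPrice = (amount) => amount / 2;

function total(amounts, pricingStrategy) {
  return amounts.map(pricingStrategy).reduce((sum, amount) => sum + amount, 0);
}

// Factory function: returns a new function "configured" by `factor`.
function multiplyBy(factor) {
  return (n) => n * factor;
}
const double = multiplyBy(2);

// Callback passing: the result is not returned, it is handed to `callback`.
function addLater(a, b, callback) {
  setTimeout(() => callback(a + b), 10);
}

console.log(total([10, 20], fullPrice)); // 30
console.log(total([10, 20], halfPrice)); // 15
console.log(double(21)); // 42
addLater(1, 2, (sum) => console.log(sum)); // 3, printed last
```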
Promises
Promises were first introduced in JavaScript by third-party libraries (such as jQuery or Bluebird) before getting native support with the ES2015 specification.
A Promise is a placeholder object for a result that is deferred. A Promise can be in one of three states: pending, fulfilled or rejected. It exposes a then(successCallback) method and a catch(rejectCallback) method. The Promise is initially in the pending state; it then either goes to fulfilled, in which case successCallback is executed, or goes to rejected, in which case rejectCallback is executed. then and catch return Promises, so the calls can be chained.
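As a minimal sketch, here is a two-step flow (load a user, then that user’s orders) written with a Promise chain. fetchUser, fetchOrders and loadOrders are hypothetical functions with simulated latency, not a real API.

```javascript
// Hypothetical async operations that return Promises.
function fetchUser(id) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (id > 0) resolve({ id, name: 'Ada' });
      else reject(new Error(`invalid user id: ${id}`));
    }, 10);
  });
}
function fetchOrders(user) {
  return new Promise((resolve) => {
    setTimeout(() => resolve([`order-1 of ${user.name}`]), 10);
  });
}

function loadOrders(id) {
  return fetchUser(id) // pending, then fulfilled or rejected
    .then((user) => fetchOrders(user)) // returning a Promise chains the next step
    .then((orders) => orders.length)
    .catch((error) => {
      // Handles a rejection coming from any of the steps above.
      console.error(error.message);
      return 0;
    });
}

loadOrders(1).then((count) => console.log(count)); // 1
loadOrders(-1).then((count) => console.log(count)); // logs the error, then 0
```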
The Promise interface makes the code more readable. It lets you separate the successCallback from the rejectCallback. Also, the calls that would normally be nested are now on the same level. The same code in callback-passing style would be:
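A sketch of a callback-passing version, assuming a hypothetical two-step flow (load a user, then that user’s orders) where each function takes a success callback and an error callback:

```javascript
// Hypothetical async operations in callback-passing style.
function fetchUser(id, onSuccess, onError) {
  setTimeout(() => {
    if (id > 0) onSuccess({ id, name: 'Ada' });
    else onError(new Error(`invalid user id: ${id}`));
  }, 10);
}
function fetchOrders(user, onSuccess, onError) {
  setTimeout(() => onSuccess([`order-1 of ${user.name}`]), 10);
}

function loadOrders(id, onSuccess, onError) {
  fetchUser(
    id,
    (user) => {
      fetchOrders(
        user,
        (orders) => onSuccess(orders.length), // one more nesting level per step
        onError // the error handler has to be threaded through...
      );
    },
    onError // ...at every level
  );
}

loadOrders(1, (count) => console.log(count), (error) => console.error(error.message));
```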
However, Promises are not silver bullets: they don’t solve all the problems.
A Promise only resolves to one value, so it cannot be used to model a stream of data. Once a Promise is fulfilled or rejected, that is its final state: the value it holds won’t change. This is ok for a call that returns one value (such as an HTTP query to a REST API or reading a file’s content once) but it cannot be used for repeatable events such as DOM events, incoming WebSocket messages, GraphQL subscription messages, etc.
If you have to manage several Promises, some boilerplate code may appear and make your code less readable. Let’s see an example where we call one API to get a list of order ids, then another API to get the orders' details one by one and finally another call to get the customer’s details (this is a fictitious example, you might have an API that provides you everything in one call).
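A possible sketch of this scenario. The three API clients (getOrderIds, getOrder, getCustomer) are hypothetical and simulated with in-memory data; one request is issued per order id.

```javascript
// Hypothetical API clients, simulated with a short delay.
const resolveLater = (value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), 10));

const getOrderIds = (customerId) => resolveLater([101, 102]);
const getOrder = (orderId) => resolveLater({ id: orderId, total: orderId * 2 });
const getCustomer = (customerId) => resolveLater({ id: customerId, name: 'Ada' });

function getCustomerSummary(customerId) {
  return getOrderIds(customerId)
    .then((orderIds) => Promise.all(orderIds.map((id) => getOrder(id))))
    .then((orders) =>
      // `orders` has to be carried along by hand: nesting comes back.
      getCustomer(customerId).then((customer) => ({ customer, orders }))
    )
    .then(({ customer, orders }) => ({
      name: customer.name,
      orderCount: orders.length,
      total: orders.reduce((sum, order) => sum + order.total, 0),
    }));
}

getCustomerSummary(7).then((summary) => console.log(summary));
// { name: 'Ada', orderCount: 2, total: 406 }
```

The intermediate object that bundles customer and orders exists only to pass both values to the next step; this is the kind of boilerplate that tends to grow with the number of Promises.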
async/await
The async/await syntax was introduced in ES2017. It is mainly syntactic sugar to make working with Promises easier and more intuitive.
An async function is a function marked with the modifier keyword async; its body can contain await expressions. An await expression suspends the progress of the async function until the Promise is settled (either fulfilled or rejected) and yields control so that the runtime can execute other code.
An async function can be either a function declaration or a function expression:
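A short sketch of these forms (the function names are hypothetical):

```javascript
// Function declaration
async function getAnswer() {
  return 42;
}

// Function expression
const getGreeting = async function (name) {
  return `Hello, ${name}`;
};

// Arrow function expression
const getSum = async (a, b) => a + b;

getAnswer().then((answer) => console.log(answer)); // 42
getGreeting('Ada').then((greeting) => console.log(greeting)); // Hello, Ada
getSum(1, 2).then((sum) => console.log(sum)); // 3
```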
An async function returns a Promise: you do not have to return a Promise explicitly, the returned value is wrapped into a Promise for you. You can either use the returned value as a Promise with the regular Promise API (.then(), .catch(), Promise.all(), etc.) or use the await keyword and have the Promise result unwrapped for you:
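Both ways of consuming the result, as a minimal sketch with hypothetical function names:

```javascript
async function getAnswer() {
  return 42; // wrapped into a Promise that fulfills with 42
}

console.log(getAnswer() instanceof Promise); // true

// 1. Regular Promise API
getAnswer().then((answer) => console.log(answer)); // 42

// 2. await, inside another async function, unwraps the value
async function getNextAnswer() {
  const answer = await getAnswer();
  return answer + 1;
}
getNextAnswer().then((value) => console.log(value)); // 43
```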
The advantages of the async/await syntax are:
- it brings back sequentiality in your code: rather than the function returning immediately and a callback function being executed at a later point in time, with an async function, the runtime waits for the result and, when it is available, unwraps it and continues executing the function body
- it frees us from callback hell and from having too many levels of indentation
- it enables the use of try/catch blocks: when using await, a rejected Promise translates into an exception:
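A minimal sketch of the try/catch case, with a hypothetical fetchUser function that rejects for invalid ids:

```javascript
// Hypothetical function: rejects when the id is not positive.
function fetchUser(id) {
  return id > 0
    ? Promise.resolve({ id, name: 'Ada' })
    : Promise.reject(new Error(`invalid user id: ${id}`));
}

async function getUserName(id) {
  try {
    const user = await fetchUser(id); // throws here if the Promise rejects
    return user.name;
  } catch (error) {
    console.error(error.message);
    return 'anonymous';
  }
}

getUserName(1).then((name) => console.log(name)); // Ada
getUserName(-1).then((name) => console.log(name)); // logs the error, then anonymous
```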
However, the async/await syntax doesn’t exempt the developer from understanding Promises and async programming in general. For more complex use cases, Promises clearly reappear in the code.
First, let’s make a function that returns a Promise which resolves after a configurable time:
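One way to write such a helper, as a sketch (the name resolveAfter and its parameters are hypothetical):

```javascript
// Resolves with `value` after `ms` milliseconds.
function resolveAfter(ms, value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), ms);
  });
}

resolveAfter(20, 'done').then((value) => console.log(value)); // done
```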
Now, let’s compare some different cases.
Sequential execution
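In this sketch, each await completes before the next timer is even created. To make the ordering visible, the hypothetical delay helper resolveAfter also records in an events array when each timer starts and ends.

```javascript
// Resolves with `value` after `ms` milliseconds and records start/end in `events`.
function resolveAfter(ms, value, events) {
  events.push(`start ${value}`);
  return new Promise((resolve) => {
    setTimeout(() => {
      events.push(`end ${value}`);
      resolve(value);
    }, ms);
  });
}

async function sequential() {
  const events = [];
  const slow = await resolveAfter(60, 'slow', events); // 'fast' has not started yet
  const fast = await resolveAfter(30, 'fast', events); // starts once 'slow' is done
  return { results: [slow, fast], events };
}

sequential().then(({ events }) => console.log(events.join(', ')));
// start slow, end slow, start fast, end fast
```

The total duration is roughly the sum of both delays.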
Concurrent start
What if we want both Promises to start concurrently, and then perform some actions on their results in a specific order?
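A sketch of this case: both Promises are created before the first await, so both timers run at the same time, but the results are still handled in the order of the await expressions. The delay helper resolveAfter and its events log are hypothetical.

```javascript
// Resolves with `value` after `ms` milliseconds and records start/end in `events`.
function resolveAfter(ms, value, events) {
  events.push(`start ${value}`);
  return new Promise((resolve) => {
    setTimeout(() => {
      events.push(`end ${value}`);
      resolve(value);
    }, ms);
  });
}

async function concurrentStart() {
  const events = [];
  const slowPromise = resolveAfter(60, 'slow', events); // both timers start now
  const fastPromise = resolveAfter(30, 'fast', events);
  const handled = [];
  handled.push(await slowPromise); // handled first, although it settles last
  handled.push(await fastPromise); // already fulfilled at this point
  return { handled, events };
}

concurrentStart().then(({ handled, events }) => {
  console.log(events.join(', ')); // start slow, start fast, end fast, end slow
  console.log(handled.join(', ')); // slow, fast
});
```

The total duration is roughly that of the slowest Promise rather than the sum of both.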
Parallel executions
What if we want all Promises to run concurrently but perform an action on all the results only once all of them have resolved?
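Promise.all fits this case. In the sketch below (the helper resolveAfter and its events log are hypothetical), the results come back in the order of the input array, not in completion order.

```javascript
// Resolves with `value` after `ms` milliseconds and records start/end in `events`.
function resolveAfter(ms, value, events) {
  events.push(`start ${value}`);
  return new Promise((resolve) => {
    setTimeout(() => {
      events.push(`end ${value}`);
      resolve(value);
    }, ms);
  });
}

async function parallel() {
  const events = [];
  // Resumes once, after every Promise in the array has fulfilled.
  const results = await Promise.all([
    resolveAfter(60, 'slow', events),
    resolveAfter(30, 'fast', events),
  ]);
  return { results, events };
}

parallel().then(({ results, events }) => {
  console.log(events.join(', ')); // start slow, start fast, end fast, end slow
  console.log(results.join(', ')); // slow, fast
});
```

Note that Promise.all rejects as soon as one of the Promises rejects, so a real use would usually be wrapped in try/catch.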
Same as above but we also want to perform an action on each Promise result as soon as it’s available.
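One possible sketch: each Promise is wrapped in a small async function that handles its own result, and Promise.all waits for all the wrappers. The names resolveAfter, parallelWithProgress and onResult are hypothetical.

```javascript
// Resolves with `value` after `ms` milliseconds.
const resolveAfter = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function parallelWithProgress(onResult) {
  const promises = [resolveAfter(60, 'slow'), resolveAfter(30, 'fast')];
  const results = await Promise.all(
    promises.map(async (promise) => {
      const value = await promise;
      onResult(value); // called as soon as this particular Promise fulfills
      return value;
    })
  );
  return results; // input order: ['slow', 'fast']
}

parallelWithProgress((value) => console.log(`got ${value}`)) // got fast, got slow
  .then((results) => console.log(results.join(', '))); // slow, fast
```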
You can also mix the async/await syntax with the Promise syntax for better readability:
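For instance, a .then() on each Promise can handle individual results while a single await waits for everything. This is a sketch with hypothetical names (resolveAfter, mixed):

```javascript
// Resolves with `value` after `ms` milliseconds.
const resolveAfter = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function mixed() {
  const handled = [];
  const track = (value) => {
    handled.push(value); // per-result action, in completion order
    return value;
  };
  const results = await Promise.all([
    resolveAfter(60, 'slow').then(track),
    resolveAfter(30, 'fast').then(track),
  ]);
  return { handled, results };
}

mixed().then(({ handled, results }) => {
  console.log(handled.join(', ')); // fast, slow
  console.log(results.join(', ')); // slow, fast
});
```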
A different paradigm: Reactive Programming
In the next article, we’ll see Reactive Programming.