Understanding Node.js - Event loop
Introduction
This article is the first in a series exploring the architecture of Node.js, with a focus on threads and memory management. Throughout this series, we will examine the threading model, memory management techniques, and built-in profiling tools available in Node.js to optimize application performance.
In this installment, we’ll concentrate on thread management, addressing the question of how Node.js efficiently handles multiple requests despite its single-threaded architecture.
Event Loop
Node.js uses the “Single Threaded Event Loop” architecture to handle multiple concurrent clients. Its processing model is based on event-driven architecture with a callback pattern.
The event loop in Node.js is structured around macrotask and microtask queues. Macrotasks are processed in several main phases, each with distinct responsibilities and designed to handle specific types of tasks. Complementing this, the microtask queue is drained between macrotask callbacks (since Node.js 11; earlier versions drained it only between phases), so microtasks take priority over any pending macrotask.
Each phase has a FIFO queue of callbacks to be executed. The time a thread spends in each phase depends on factors such as the number of tasks in the queue or the maximum allowed time to be spent in the phase.
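The FIFO behaviour of a phase queue can be observed with a small sketch. It uses the check phase (setImmediate) only because that queue is easy to fill from application code; the helper name immediatesInOrder is hypothetical and introduced just for illustration.

```typescript
// Hypothetical helper: schedules three check-phase callbacks and reports
// the order in which they actually ran.
function immediatesInOrder(): Promise<number[]> {
  const order: number[] = [];
  return new Promise((resolve) => {
    for (let i = 1; i <= 3; i++) {
      // Each callback is appended to the end of the check-phase queue.
      setImmediate(() => order.push(i));
    }
    // Scheduled last, so it runs after the three callbacks above.
    setImmediate(() => resolve(order));
  });
}

immediatesInOrder().then((order) => console.log(order));
// → [ 1, 2, 3 ]
```

The callbacks run in the order they were scheduled, which is what a first-in, first-out queue implies.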
When a task like network communication, file read/write, or database access is initiated, Node.js does not wait for the operation to complete. Instead, it delegates the task to the system’s kernel or utilizes additional worker threads through the Node.js libuv library, which is responsible for asynchronous operations. The event loop can then continue processing other code, using the main thread.
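As a rough illustration of this hand-off, the sketch below starts a file read and records what happens on the main thread in the meantime. The helper name nonBlockingRead is hypothetical, and the file being read (the script itself) is chosen only for convenience.

```typescript
import { readFile } from "node:fs";

// Hypothetical helper: records the order in which the steps run.
function nonBlockingRead(path: string): Promise<string[]> {
  const order: string[] = [];
  return new Promise((resolve, reject) => {
    order.push("read requested");
    // The read is handed off; this callback runs later, on the main thread.
    readFile(path, (error) => {
      if (error) {
        reject(error);
        return;
      }
      order.push("read completed");
      resolve(order);
    });
    // Runs before the file contents are available.
    order.push("main thread continues");
  });
}

nonBlockingRead(__filename).then((order) => console.log(order));
// → [ 'read requested', 'main thread continues', 'read completed' ]
```

The main thread reaches the last push before the read completes, because the callback is only invoked once the event loop picks up the finished operation.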
The high-level architecture diagram looks as follows:
This architecture enables Node.js to handle multiple requests on a single thread with high efficiency and performance.
Macro tasks
Timers phase
This phase handles the callbacks for timers that were scheduled using the setTimeout and setInterval functions. A timer specifies a minimum delay before the callback can be executed, but it does not guarantee an exact execution time. The callback will be triggered as soon as possible after the specified time has elapsed, depending on the state of the event loop and other phases' queues, as well as the operating system.
There is also another timer function, setImmediate, but it is not handled during the Timers phase.
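The "minimum delay" behaviour of setTimeout can be demonstrated by keeping the main thread busy past the requested delay. This is a minimal sketch; the helper name measureTimerDelay and the 10 ms / 50 ms values are arbitrary choices for illustration.

```typescript
// Hypothetical helper: schedules a timer, then blocks the main thread.
// Resolves with the number of milliseconds that passed before the callback ran.
function measureTimerDelay(requestedMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), requestedMs);
    // Busy-wait: the event loop cannot reach the timers phase until this ends.
    while (Date.now() - start < blockMs) {
      // spin
    }
  });
}

measureTimerDelay(10, 50).then((elapsed) => {
  console.log(`requested 10 ms, callback ran after ${elapsed} ms`);
});
```

The reported value is at least 50 ms even though only 10 ms were requested, because the callback has to wait until the main thread is free again.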
Pending Callbacks phase
The purpose of this phase is to handle I/O-related callbacks that were deferred to the next cycle of the event loop. This phase processes callbacks for specific types of system operations, particularly those involving non-blocking I/O, such as errors from file system operations or network requests.
Idle phase
This phase is not related to the execution of application code; Node.js uses it internally for housekeeping and other system-related tasks.
Poll phase
The purpose of this phase is to manage a queue of I/O-related callbacks. It also determines how long to block and poll from the queue. When the queue becomes too long, causing the scheduled timers to be significantly delayed, the process may interrupt the handling of pending callbacks to allow other phases to proceed.
If all the callbacks from the queue have been processed, one of two things may happen:
- There are callbacks scheduled with setImmediate - the event loop leaves the poll phase and enters the check phase to handle the setImmediate callbacks.
- There are no callbacks scheduled with setImmediate and no timers are due - the event loop may stay in the poll phase, idle, waiting for new I/O operations.
Check phase
This is a special phase that handles callbacks scheduled with the setImmediate function.
Close Callbacks phase
The Close Callbacks phase is the last phase in every event loop iteration. It is responsible for cleaning up resources that were closed unexpectedly, such as when socket.destroy() is called or when a file stream closes due to a read/write error. Callbacks registered for these events (often through event listeners like 'close') are executed in this phase and will not be used in subsequent cycles.
Close callbacks are commonly associated with network sockets, file streams, and other resources that require proper cleanup to prevent resource leaks and ensure efficient memory usage.
Micro tasks
Microtasks are a specific type of task that are handled by the main thread in JavaScript. They are stored in a special microtask queue and are executed at the end of the current script execution, before the event loop moves on to the next phase (which could be handling I/O operations, timers, or other macrotasks).
Microtasks are typically created in response to promise resolutions (using .then()) or rejections (using .catch()). When a promise is resolved or rejected, its associated callback is queued in the microtask queue, ensuring that it is handled immediately after the current execution context is completed but before any other tasks are processed.
Microtasks are designed to execute small, specific tasks that need to be completed before moving to the next phase of the event loop. This helps maintain a responsive and efficient execution flow in asynchronous programming.
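The priority of microtasks over macrotasks can be sketched as follows. The helper name microtasksBeforeMacrotasks is hypothetical; the example only relies on setTimeout, Promise.prototype.then and queueMicrotask.

```typescript
// Hypothetical helper: records the order in which the callbacks run.
function microtasksBeforeMacrotasks(): Promise<string[]> {
  const order: string[] = [];
  return new Promise((resolve) => {
    // Macrotask: handled in the timers phase of a later loop iteration.
    setTimeout(() => {
      order.push("setTimeout");
      resolve(order);
    }, 0);
    // Microtasks: queued in this order, run right after the current code ends.
    Promise.resolve().then(() => order.push("promise then"));
    queueMicrotask(() => order.push("queueMicrotask"));
    order.push("synchronous code");
  });
}

microtasksBeforeMacrotasks().then((order) => console.log(order));
// → [ 'synchronous code', 'promise then', 'queueMicrotask', 'setTimeout' ]
```

Although the timer was scheduled first, both microtasks run before it, and they run in the order in which they were queued.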
Examples
setTimeout vs setImmediate
setTimeout and setImmediate are timer functions designed to run a callback at a future time. The setTimeout callback will be executed once the specified delay has elapsed. setImmediate, on the other hand, is designed to execute the callback immediately after the event loop's current poll phase completes.
The execution order of these timers can vary depending on the context in which they are called. If both functions are invoked within the main module, the execution order is not deterministic and may vary based on CPU performance or the system's overall workload. However, if they are called within an asynchronous context (e.g., inside an I/O callback), setImmediate will consistently be executed before setTimeout.
Create an empty index.ts file and run the following code several times:
setImmediate(() => {
  console.log("hello from setImmediate");
});
setTimeout(() => {
  console.log("hello from setTimeout");
}, 0);
As expected, the result is not deterministic and we can see hello from setImmediate and hello from setTimeout in random order.
However, if we change the context and run the code inside an asynchronous context, e.g. an I/O callback:
import * as fs from "node:fs";

async function asyncTimers() {
  return new Promise((resolve) => {
    // Both timers are scheduled from inside an I/O callback (poll phase).
    fs.readFile(__filename, (error, data) => {
      setTimeout(() => {
        console.log("hello from setTimeout");
      }, 0);
      setImmediate(() => {
        console.log("hello from setImmediate");
      });
      resolve(true);
    });
  });
}
asyncTimers();
the result will always be:
hello from setImmediate
hello from setTimeout
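This is because both timers are scheduled from an I/O callback, which runs during the poll phase. The check phase comes immediately after the poll phase, so any scheduled setImmediate callbacks are executed before the event loop wraps around to the timers phase.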
Microtasks
We can slightly modify the above code to see how microtasks behave. Let's create a new function with the following content and run it:
// Assumes fs is imported: import * as fs from "node:fs";
async function microtasks() {
  return new Promise((resolve) => {
    fs.readFile(__filename, (error, data) => {
      setTimeout(() => {
        console.log("hello from setTimeout");
      }, 0);
      setImmediate(() => {
        console.log("hello from setImmediate");
      });
      // Queued as a microtask before the promise is resolved.
      queueMicrotask(() => {
        console.log("hello from queueMicrotask");
      });
      resolve(true);
    });
  });
}
microtasks().then(() => {
  console.log("hello from then");
});
As you may expect, the result is:
hello from queueMicrotask
hello from then
hello from setImmediate
hello from setTimeout
Since microtasks are handled using a FIFO (First In, First Out) queue, the hello from queueMicrotask callback was added first, followed by the hello from then callback. They were then executed in that exact order, before the event loop moved on to the next phase. The timer callbacks were executed afterwards, as described above.
Summary
This is all about the Node.js event loop. I hope you found it interesting and now have a better understanding of how Node.js handles non-blocking I/O. This is the first in a series of articles where we’ll dive deeper into Node.js architecture. Stay tuned for our next piece, which will focus on performance profiling and optimization.