Developers, testers, and operations staff need to capture and analyse diagnostics data from Node.js applications during development and during deployment. This article covers six common failure scenarios and the tools and techniques you can use to tackle them.

Common Node.js deployment problems

Problems occurring in Node.js application deployments can have a range of symptoms, but can generally be categorized into the following:

  • Uncaught exception or error event in JavaScript code
  • Excessive memory usage, which may result in an out-of-memory error
  • Unresponsive application, possibly looping or hanging
  • Poor performance
  • Crash or abort in native code
  • Unexpected application behavior or functional issue

The approach you take to diagnose the problem depends on the scenario but may also depend on the requirements of the application deployment. The priority in production deployments is to maintain the availability of the application, which usually involves fail-over to a separate application instance and immediate fixing of the failing application.

Techniques used during development, such as attaching a debugger or adding instrumentation to the application, are not usually available in production deployments. The use of tracing or monitoring tools that are not invasive and have minimal impact on the running application might be possible, but the capture of diagnostic information such as logs and dumps at the point of failure is often the most practical approach.

Below we cover the common failure scenarios and discuss tools you can use to spot the problems during development and deployment.

Uncaught exceptions or error events in JavaScript code

Uncaught exceptions or error events in JavaScript code usually result in termination of the Node.js application, with an error message and stack trace written to the console (the stderr stream). For example, the following application uses the Node.js File System API to open a non-existent file, and the output shown after the code is what Node.js writes to stderr:

// Node.js example application #1 - uncaught exception
const fs = require('fs');

function FileOpen() {
    // The file does not exist, so openSync() throws an ENOENT error
    fs.openSync('/non/existent/file', 'r');
}

console.log('example1.js: Node.js application running');
FileOpen();

Error: ENOENT: no such file or directory, open '/non/existent/file'
    at Object.fs.openSync (fs.js:652:18)
    at FileOpen (/home/rnchamberlain/test/exception.js:6:8)
    at Object.<anonymous> (/home/rnchamberlain/test/exception.js:11:1)
    at Module._compile (module.js:573:30)
    at Object.Module._extensions..js (module.js:584:10)
    at Module.load (module.js:507:32)
    at tryModuleLoad (module.js:470:12)
    at Function.Module._load (module.js:462:3)
    at Function.Module.runMain (module.js:609:10)
    at startup (bootstrap_node.js:158:16)

The exception stack trace written to the application’s stderr stream may be sufficient to locate and diagnose the problem. If more information is required, several modules are available that capture additional information when an exception occurs.
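Before reaching for an add-on module, it can help to see what Node.js itself makes available at the point of failure. The following is a minimal sketch, not a substitute for those modules: it registers a process-level 'uncaughtException' handler that writes a small JSON record to stderr and then exits. The buildCrashRecord helper and the fields it collects are illustrative assumptions, and the origin argument is only passed by newer Node.js releases.

```javascript
const fs = require('fs');
const os = require('os');

// Hypothetical helper: collects context that the default stack trace omits.
function buildCrashRecord(err, origin) {
  return {
    time: new Date().toISOString(),
    origin: origin || 'unknown',
    name: err && err.name,
    code: err && err.code, // for example 'ENOENT' for file system errors
    message: err && err.message,
    stack: err && err.stack,
    pid: process.pid,
    nodeVersion: process.version,
    platform: `${os.platform()} ${os.release()}`,
    uptimeSeconds: process.uptime(),
    memory: process.memoryUsage(),
  };
}

process.on('uncaughtException', (err, origin) => {
  // Write synchronously to file descriptor 2 (stderr): the process is about
  // to exit, so an asynchronous write might never complete.
  const record = buildCrashRecord(err, origin);
  fs.writeSync(2, JSON.stringify(record, null, 2) + '\n');
  process.exit(1);
});
```

The handler exits with a non-zero code rather than resuming, because the application is in an unknown state after an uncaught exception.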

Tooling suggestions for uncaught errors or exceptions in JavaScript

Excessive memory usage, which may result in an out-of-memory error

Excessive memory usage by a Node.js application is often detected by scripts that use operating-system facilities (for example, the ps or top commands) or by production monitoring tools. Sometimes the application fails because it reaches a limit configured in Node.js or in the operating system. This is known as an out-of-memory error.
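The same figures that ps or top report from outside can also be sampled from inside the process. The sketch below is one possible approach, assuming a periodic check is acceptable in your deployment: it compares the used JavaScript heap with the V8 heap size limit and logs a warning when a threshold is crossed. The helper names, the 10-second interval, and the 0.8 ratio are assumptions chosen for illustration.

```javascript
const v8 = require('v8');

const MB = 1024 * 1024;

// Hypothetical helper: summarises current memory use in megabytes.
function sampleMemory() {
  const usage = process.memoryUsage();
  const heapLimit = v8.getHeapStatistics().heap_size_limit;
  return {
    rssMb: usage.rss / MB,               // resident set size, as seen by ps/top
    heapUsedMb: usage.heapUsed / MB,     // live JavaScript heap
    heapLimitMb: heapLimit / MB,         // limit configured for the V8 heap
    heapUsedRatio: usage.heapUsed / heapLimit,
  };
}

// Hypothetical helper: logs a warning when heap use crosses warnRatio.
function startMemoryWatch(intervalMs = 10000, warnRatio = 0.8) {
  const timer = setInterval(() => {
    const sample = sampleMemory();
    if (sample.heapUsedRatio >= warnRatio) {
      console.error('High heap usage:', JSON.stringify(sample));
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return timer;
}
```

Calling startMemoryWatch() once at startup is enough. A ratio that climbs steadily towards 1 suggests the application is approaching the Node.js heap limit rather than an operating-system limit.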

An out-of-memory error causes the Node.js application to write error output and then terminate. The following example shows sample error output.

  <‑‑‑ Last few GCs ‑‑‑>
     7299 ms: Mark-sweep 33.0 (64.2) ‑> 33.0 (64.2) MB, 72.2 / 0.0 ms (+ 14.6 ms in 13 steps since start of marking, biggest ms) [allocation failure] [GC in old space requested].
     7391 ms: Mark-sweep 33.0 (64.2) ‑> 33.0 (64.2) MB, 92.8 / 0.0 ms [allocation failure] [GC in old space requested].
     7485 ms: Mark-sweep 33.0 (64.2) ‑> 33.0 (46.2) MB, 93.5 / 0.0 ms [last resort gc].
     7580 ms: Mark-sweep 33.0 (46.2) ‑> 33.0 (46.2) MB, 95.4 / 0.0 ms [last resort gc].

  <‑‑‑ JS stacktrace ‑‑‑>
  Security context: 0000037D843CFB61 <JS Object>
     2: my_listener [C:\test\heap_oom.js:~9] [pc=00000255FA5B3A8D] (this=000001240487E6E9 <a Server with map 000002302D633F89>,request=000001240487E5F9 <an IncomingMessage with map 000002302D636EF9>,response=000001240487E581 <a ServerResponse with map 00000F79>)
     3: emitTwo(aka emitTwo) [events.js:106] [pc=00000255FA522153] (this=0000037D84304381 <undefined>,handler=000003F40AEE39F

  FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

The information from memory monitoring tools or the error output is not usually sufficient for you to diagnose the problem. It does not provide details of the often complex memory usage inside the application. You must take additional steps to capture more information and use appropriate tooling to analyse it.
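One way to capture that extra information is a heap snapshot, which can be loaded into the Memory tab of Chrome DevTools for analysis. The sketch below assumes a Node.js release that provides v8.writeHeapSnapshot(); the captureHeapSnapshot name, the file-name pattern, and the use of the temporary directory are illustrative assumptions.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes a heap snapshot and returns the file path.
// Writing the snapshot is synchronous, so the event loop is blocked until
// it finishes, and the process needs spare memory to build it.
function captureHeapSnapshot(directory = os.tmpdir()) {
  const fileName = path.join(
    directory,
    `heap-${process.pid}-${Date.now()}.heapsnapshot`
  );
  return v8.writeHeapSnapshot(fileName);
}
```

Because of the pause and the extra memory required, a snapshot is better taken on demand (for example, from an administrative endpoint) than on a timer. Comparing two snapshots taken some time apart usually makes growing object populations easier to spot.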

Tooling suggestions for excessive memory usage

Unresponsive application, possibly looping or hanging

An unresponsive Node.js application can be detected by watchdog facilities in the production environment, or by the application users themselves.

You can initially use operating-system commands such as ps (on Linux systems) to find out if the application is looping (indicated by high CPU usage) or waiting (indicated by low CPU usage).
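The check described above can also be scripted from a separate watchdog process. The following sketch assumes a Linux system where ps -o cputime= prints cumulative CPU time as [DD-]HH:MM:SS. It reads that value twice and compares the CPU time consumed with the elapsed wall-clock time. The function names and the 0.8 threshold are assumptions, and the result is only a first indication, not a diagnosis.

```javascript
const { execFileSync } = require('child_process');

// Parse the [DD-]HH:MM:SS value printed by `ps -o cputime=` into seconds.
function parseCpuTime(text) {
  const match = /^(?:(\d+)-)?(\d+):(\d{2}):(\d{2})$/.exec(text.trim());
  if (!match) {
    throw new Error(`Unrecognised cputime value: ${text}`);
  }
  const [, days = '0', hours, minutes, seconds] = match;
  return (
    Number(days) * 86400 +
    Number(hours) * 3600 +
    Number(minutes) * 60 +
    Number(seconds)
  );
}

// Read the cumulative CPU time of a process (Linux ps output assumed).
function readCpuSeconds(pid) {
  const out = execFileSync('ps', ['-o', 'cputime=', '-p', String(pid)], {
    encoding: 'utf8',
  });
  return parseCpuTime(out);
}

// High CPU consumption over the interval suggests a loop, low suggests a wait.
// busyRatio is an assumed threshold, not a documented value.
function classify(cpuBefore, cpuAfter, elapsedSeconds, busyRatio = 0.8) {
  const ratio = (cpuAfter - cpuBefore) / elapsedSeconds;
  return ratio >= busyRatio ? 'looping' : 'waiting';
}
```

To use it, call readCpuSeconds(pid) twice, several seconds apart, and pass both readings and the elapsed time to classify(). The cputime value has one-second resolution, so very short intervals give unreliable results.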

Tooling suggestions for unresponsive Node.js applications

Poor performance

As with unresponsive applications, performance issues are detected by watchdog facilities in the production environment or by the application users themselves. The symptom might be slow response times or excessive use of CPU or memory by the application.
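To tell a response-time problem from a CPU problem, it helps to measure both for a suspect operation. The sketch below wraps a synchronous function and reports wall-clock time alongside the CPU time the process consumed while it ran. The measure helper and its return shape are assumptions made for illustration. Note that the CPU figure covers the whole process, not only the wrapped function.

```javascript
// Hypothetical helper: times a synchronous function.
function measure(label, fn) {
  const startCpu = process.cpuUsage();
  const startWall = process.hrtime.bigint();

  const result = fn();

  // hrtime.bigint() returns nanoseconds; convert to milliseconds.
  const wallMs = Number(process.hrtime.bigint() - startWall) / 1e6;
  // cpuUsage(previous) returns the difference in microseconds.
  const cpu = process.cpuUsage(startCpu);
  const cpuMs = (cpu.user + cpu.system) / 1000;

  return { label, result, wallMs, cpuMs };
}

const report = measure('sum to one million', () => {
  let total = 0;
  for (let i = 1; i <= 1000000; i++) total += i;
  return total;
});
console.log(`${report.label}: wall ${report.wallMs.toFixed(2)} ms, cpu ${report.cpuMs.toFixed(2)} ms`);
```

When wall-clock time is much larger than CPU time, the operation is mostly waiting (for example, on I/O). When the two are close, the operation is CPU-bound.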

Tooling suggestions for poor performance in Node.js applications

Crash or abort in native code

If a Node.js application crashes or aborts in native code, the symptoms are minimal. The application stops immediately, but usually produces at least a simple message on the stdout or stderr stream. The key diagnostic technique is to capture a core-dump. Operating-system configuration settings such as ulimit (on Linux systems) might be needed for a core-dump to be written.
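Whether the core-file limit is set correctly can be checked before a failure occurs. The following sketch assumes Linux, where /proc/self/limits lists the resource limits of the current process. It reads the "Max core file size" row and reports whether the soft limit permits a core-dump. The function names are assumptions, and other settings (such as the kernel's core pattern) can still affect where, or whether, the dump is written.

```javascript
const fs = require('fs');

// Parse the "Max core file size" row of /proc/<pid>/limits.
function parseCoreLimit(limitsText) {
  const match = /^Max core file size\s+(\S+)\s+(\S+)/m.exec(limitsText);
  if (!match) return null;
  return { soft: match[1], hard: match[2] };
}

// A soft limit of 0 means no core file can be written.
function coreDumpsEnabled(limitsText) {
  const limit = parseCoreLimit(limitsText);
  return limit !== null && limit.soft !== '0';
}

// Returns true or false on Linux, or null when /proc is not available.
function checkCurrentProcess() {
  try {
    return coreDumpsEnabled(fs.readFileSync('/proc/self/limits', 'utf8'));
  } catch (err) {
    return null;
  }
}
```

Logging the result of checkCurrentProcess() at startup shows whether the limit needs to be raised (for example, with ulimit -c unlimited in the shell that launches the application) before a crash happens.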

Tooling suggestions for a Node.js application that crashes or aborts in native code

  • Use the LLDB debugger with the llnode V8 plugin: If a core-dump can be captured, you can use the LLDB debugger with the llnode V8 plugin to obtain a native stack trace showing the point of failure, and to inspect other data such as register and memory values. See Exploring Node.js core dumps using the llnode plugin for lldb for more information.

Unexpected application behavior or functional issue

With unexpected application behavior or functional issues, the symptoms depend on the application but generally take the form of incorrect output, observed during routine testing or later by application users.

Tooling suggestions for unexpected application behavior

  • Built-in Node.js options: Use the built-in Node.js --trace*, --log*, and --print* options to obtain more information about the application.

  • LLDB debugger with the llnode V8 plugin: Capture a core-dump by using an operating-system command such as gcore (on Linux systems). You can then use the LLDB debugger with the llnode V8 plugin to investigate data inside the application (JavaScript object properties) and to examine JavaScript application code. See Exploring Node.js core dumps using the llnode plugin for lldb and Advances in core-dump debugging for Node.js for more information.