Deploy Node.js securely: Avoiding arbitrary code execution vulnerabilities when using Node.js child process APIs

Arbitrary code execution is when an attacker can convince a target to run arbitrary code not intended by the target’s author. When done remotely, it’s called remote code execution, and it can be a devastating attack against an online service.

Arbitrary code execution with the Node.js child process APIs

Last fall there was a stream of security issues reported against Node.js packages in the ecosystem that arose from package authors not paying close enough attention to this warning, repeated throughout the Node.js documentation for child_process:

If the shell option is enabled, do not pass unsanitized user input to this function. Any input containing shell meta-characters may be used to trigger arbitrary command execution.

Examples of such meta-characters are ;, $, %, and new line (NL) characters. The exact set, unfortunately, depends on the shell being used.

The issues followed a common pattern: an npm package with a JavaScript API took user-provided arguments and passed them on to child_process APIs, assuming they were simple strings. A few of the reports involved calling the git command line utility from a child shell (“shelling out”), so let’s use it as an example and walk through the process of publishing a package that contains a full remote code execution vulnerability.

Take note: none of this is specific to git. Any command line tool (ls, ImageMagick, …) could be abused in the same way.

git-url: Do not do this!

Let’s write some code to find the URL of one of the remotes for a Git repository:

const exec = require('child_process').execSync;

function url(remote) {
  return exec(`git remote get-url ${remote}`, {encoding: 'utf8'})
}

console.log('URL:', url('origin'));

As it stands, this code is unremarkable; feel free to use it in some project’s automation if it’s useful. However, it can easily become exploitable. I’ll show you how.

Let’s refactor it a bit. We’ll transform it from a bit of local code into a small package publishable to npmjs.com (because small packages are great).

The package index.js file will become:

const exec = require('child_process').execSync;

function url(remote) {
  return exec(`git remote get-url ${remote}`, {encoding: 'utf8'})
}

module.exports = url;

If we were actually going to publish this (which I will not!), it would need a name. For example, it could be called @octet/git-url. Users of the package would then be able to run npm install @octet/git-url and write code like this:

const url = require('@octet/git-url');

console.log('URL:', url('origin'));

At this point, nothing much seems to have changed beyond a simple refactor of a function into a reusable package. Though seemingly insignificant, the refactoring created a reportable vulnerability.

A user of the git-url package should not have to know that when the package is called with a remote value of 'origin; touch I_P0WN_YOU', the actual command that executes is:

git remote get-url origin; touch I_P0WN_YOU
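
To see why, it helps to build the command string without running it. The sketch below uses a hypothetical helper, buildCommand, that mirrors the template literal inside url(). It only returns the string that execSync would hand to the shell and never executes anything.

```javascript
// Hypothetical helper (not part of the package): returns the exact string
// that url() would pass to execSync, without executing anything.
function buildCommand(remote) {
  return `git remote get-url ${remote}`;
}

// A benign value produces the command the author had in mind.
console.log(buildCommand('origin'));

// A hostile value appends a second command after the `;` separator.
console.log(buildCommand('origin; touch I_P0WN_YOU'));
```

Because the shell receives one flat string, it cannot tell which part came from the package author and which part came from the caller.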

Still, we could consider this to not be a big deal. Nobody would do that by accident.

Perhaps not, but what if some user of this package decided to build a small web application that would accept a string as a user-input and pass it to this package to find the remote URL?

That might seem a bit far-fetched for this specific package, but it is possible. If it happened, that web app would give any user the ability to run any shell command they wanted on the server. They could copy the password file back to themselves, delete every file the web app had access to, or exploit a kernel bug and get root access. All kinds of nastiness could occur.
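
A minimal sketch of such a web app, under stated assumptions, might look like the following. The route shape (/url?remote=...), the port, and the helper name remoteFromRequest are all hypothetical, and the call into the package is left as a comment so that the example never runs a shell command.

```javascript
const http = require('http');

// Hypothetical helper: pulls the `remote` query parameter out of a request
// URL such as /url?remote=origin. The value is entirely attacker-controlled.
function remoteFromRequest(requestUrl) {
  // A base is needed only because request URLs are relative paths.
  return new URL(requestUrl, 'http://localhost').searchParams.get('remote');
}

// Sketch of the vulnerable handler.
const server = http.createServer((req, res) => {
  const remote = remoteFromRequest(req.url);
  // const answer = url(remote);  // <-- the attacker's string reaches the shell
  res.end(`would look up remote: ${remote}\n`);
});

// server.listen(8080);  // left commented out on purpose

// What the handler would extract from a hostile, URL-encoded request:
console.log(remoteFromRequest('/url?remote=origin%3B%20touch%20I_P0WN_YOU'));
```

URL decoding turns %3B back into a semicolon, so the string that arrives at the package is exactly the hostile value shown earlier.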

This kind of usage of child_process is not acceptable in published packages.

Enough warnings, how to handle this

Now that you know what could happen, let’s look at how to avoid arbitrary code execution. Simply put, don’t let your packages be reported as containing code execution vulnerabilities. If your packages accept a user-provided string, make sure it contains no unexpected characters. But how do you do that?

I suggest that instead of searching for unsafe characters, robust code should explicitly limit strings to a known safe set, unless there is strong reason to allow any others.

For example:

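const exec = require('child_process').execSync;

// Allowlist: letters, digits, underscore, and hyphen only; at least one character.
const safe = /^[a-zA-Z0-9_-]+$/;

function url(remote) {
  // Reject anything outside the allowlist before it reaches the shell.
  if (!safe.test(remote))
    throw new Error('remote is not safe');

  return exec(`git remote get-url ${remote}`, {encoding: 'utf8'});
}

module.exports = url;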
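
An alternative that the example above does not use is to avoid the shell altogether. child_process.execFileSync runs the program directly and passes each array element as a separate argument, so shell meta-characters are not interpreted unless the shell option is turned on. The sketch below is one possible shape rather than a drop-in replacement: gitArgs is a hypothetical helper, and the rule that rejects a leading hyphen is an assumption added here, on the reasoning that git might otherwise read the value as an option.

```javascript
const { execFileSync } = require('child_process');

// Same allowlist idea as above, but the first character may not be a hyphen
// (assumption: this keeps git from treating the value as an option).
const safe = /^[a-zA-Z0-9_][a-zA-Z0-9_-]*$/;

// Hypothetical helper: validates the remote and returns the argument vector.
function gitArgs(remote) {
  if (typeof remote !== 'string' || !safe.test(remote))
    throw new Error('remote is not safe');
  return ['remote', 'get-url', remote];
}

function url(remote) {
  // No shell is involved: git receives each array element as one argument.
  return execFileSync('git', gitArgs(remote), { encoding: 'utf8' });
}

module.exports = url;
```

Validation still matters in this version. Skipping the shell removes shell injection, but the value is still handed to git, which applies its own rules to its arguments.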

Why doesn’t Node.js deal with this for us?

While Node.js could probably do a better job of documenting the shell special characters, this is difficult because there are multiple possible shells even on a single system (think csh and bash on Linux, or cmd.exe and PowerShell on Windows). Worse, one shell can have multiple names, so Node.js cannot know which shell is being used. Plus, if the set of special characters were too restrictive, it would forbid characters that are not special in some shells.

It’s not clear how to deal with this robustly and consistently across platforms, so a solution hasn’t yet been found. If you have any suggestions, PRs are welcome.

Real-world examples

One of the advantages of responsible disclosure of vulnerabilities is that when the reports eventually become public (after the vulnerabilities are fixed), we can all learn from them.

Here are a couple of examples of issues reported to the Node.js Ecosystem Working Group via HackerOne.

Remember, these aren’t intended as examples of mistakes others have made. Consider them carefully as examples of mistakes any of us could make (and perhaps already have). Every use of the child_process API should be examined to make sure no unsanitized user input is being passed. Remember, check not just your code, but all of your dependencies.
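
As a starting point for that review, a crude text search can list the places worth reading by hand. The sketch below is a hypothetical helper, findChildProcessUse, that flags lines mentioning child_process or calling exec, execSync, spawn, and similar functions. It is a plain regular-expression match, so it will miss aliased calls and flag some harmless ones (for example, RegExp exec calls). It is no substitute for reading the code.

```javascript
// Hypothetical helper: returns the 1-based line numbers in `source` that
// mention child_process or call one of its process-spawning functions.
const suspicious =
  /child_process|\b(?:exec|execSync|execFile|execFileSync|spawn|spawnSync|fork)\s*\(/;

function findChildProcessUse(source) {
  const hits = [];
  source.split('\n').forEach((line, index) => {
    if (suspicious.test(line)) hits.push(index + 1);
  });
  return hits;
}

// Example input: the vulnerable index.js from earlier in this article.
const sample = [
  "const exec = require('child_process').execSync;",
  '',
  'function url(remote) {',
  "  return exec(`git remote get-url ${remote}`, {encoding: 'utf8'})",
  '}',
].join('\n');

console.log(findChildProcessUse(sample));
```

The same function could be run over each file under node_modules to cover dependencies, which is where several of the reported issues were found.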

Do this check on your packages before someone else does!