Explore Node.js core dumps using the llnode plugin for lldb

This article is a primer for developers who are interested in using core dumps to help them debug Node.js problems. If you are a C/C++ developer who is already comfortable with core dumps, ulimits, gcore and debuggers, much of this will be familiar, but being able to integrate JavaScript debugging with C/C++ is useful if you are working on native npm modules or other C/C++ code that interacts with JavaScript. If you’re a pure Node.js developer wondering whether core dumps are something you should care about, or at least be aware of in case of an emergency, then hopefully this article will provide a starting place.

In this article, I offer a basic demonstration of how you can use llnode to get a better view of how Node.js invokes your JavaScript code.

First, let’s define a core dump. A core dump is a snapshot of a process’s memory at a particular point in time. There are many ways to create one, but the most common is a crash within the process, which causes the operating system to create a core dump and terminate the process. Generally, you can access core dumps using a debugger such as LLDB or gdb.
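You don’t have to wait for a crash to get one, either. As a quick illustration (a minimal sketch, with a hypothetical file name), Node.js can produce a core dump on demand with process.abort(), which raises SIGABRT and, if core dumps are enabled, causes the operating system to write a core file:

// abort-demo.js - force a core dump on demand (only works if core dumps are enabled, see below).
console.log(`About to abort process ${process.pid}`);
process.abort(); // raises SIGABRT and terminates the process; the OS writes the core file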

llnode is a plugin for the LLDB debugger that lets it make sense of many of the v8 structures within a core dump. You can use it to help understand what the state of your program was at the time the core dump was generated.

You can find the instructions on how to compile and install llnode on the project’s GitHub repository.

Core dumps

If you aren’t really familiar with core dumps, there are a couple of things you should know about them (although really they are two sides of the same thing):

  • Core dumps are a snapshot of your process’s entire memory, so core dumps are big. Although most operating systems can create one when a process crashes, this is often disabled by default for exactly that reason: it’s bad if a process crashes frequently and needs restarting, and it’s disastrous if the process is restarted automatically but keeps crashing and fills your disk with core files.
  • Core dumps are a snapshot of your process’s entire memory, which includes things like usernames, passwords and other data you don’t want to share. Giving someone a core dump of your application gives them access to all the data it had in memory at the time the core dump was created.

There are several ways to get a core dump. The most well known is the OS generating one when your application crashes. Node.js has a flag, --abort-on-uncaught-exception, which makes it easy to get a core dump from JavaScript code by forcing an abort when an uncaught exception crashes your program. That’s a good way to experiment with llnode, and it’s what the example below uses. Since core dumps are often disabled to stop crashing applications from filling up your disk, we should first make sure they are enabled. The basic way to re-enable them is to change the ulimit for core files from 0 to unlimited:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
....

$ ulimit -c unlimited
$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
....

Setting the ulimit at the terminal in this way changes the setting for processes started within that terminal session, but it is not a permanent, system-wide change.

On Mac OS this should cause core files to be created in /cores. The core file will be named /cores/core.[process_id].

On Linux, by default, core files are created in the current working directory. There are also a number of other configuration settings that control what happens with a core dump: whether one is created, where it is created and what it is called. In particular, the setting in /proc/sys/kernel/core_pattern can change the file name or even pipe the core dump to another tool. You can usually find up-to-date instructions for your distribution just by Googling “enable core dumps [distro-name]”. A good rule of thumb is to check the syslog after a core dump has been generated, as there will often be a message telling you where it went.
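If you just want to check where a core dump will go on a Linux box, you can read the pattern directly from that file. The sketch below is only an illustration (the file name is mine, and it is Linux-only):

// core-pattern.js - print the current Linux core dump pattern (Linux only).
var fs = require('fs');

var pattern = fs.readFileSync('/proc/sys/kernel/core_pattern', 'utf8').trim();
if (pattern[0] === '|') {
  console.log(`Core dumps are piped to a helper program: ${pattern}`);
} else {
  console.log(`Core dumps are written using the pattern: ${pattern}`);
}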

llnode example

Starting with a simple (and broken) web server:

var http = require('http');
var host = "127.0.0.1";
var port = 1338;

server = http.createServer(function myRequestListener(req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
  res.not_a_function();
}).listen(port, host);

console.log(`Server process ${process.pid} running at http://${host}:${port}/`);

This will only serve one request before the exception inside the requestListener function crashes it. (I’ve named my request listener, as that makes for a better stack trace. I don’t know of any technical reason not to do that, and it does make debugging easier, native or otherwise.)
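To trigger the crash, start the server and then send it a single request. A browser or curl works, or a few lines of Node.js like this sketch (the file name is mine; it assumes the server above is listening on 127.0.0.1:1338):

// request.js - send one request to the broken server to trigger the crash.
var http = require('http');

http.get('http://127.0.0.1:1338/', function (res) {
  console.log('Response status: ' + res.statusCode);
  res.resume(); // drain the response so the socket can close
}).on('error', function (err) {
  // The server aborts right after responding, so a socket error here is also possible.
  console.log('Request error: ' + err.message);
});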

If you run this with:

$ node --abort-on-uncaught-exception httpexception.js 
Server process 14104 running at http://127.0.0.1:1338/
Uncaught TypeError: res.not_a_function is not a function

FROM
Server.myRequestListener (/Users/hhellyer/work/testscripts/httpexception.js:8:7)
emitTwo (events.js:87:13)
Server.emit (events.js:172:7)
HTTPParser.parserOnIncoming [as onIncoming] (_http_server.js:533:12)
HTTPParser.parserOnHeadersComplete (_http_common.js:103:23)
Illegal instruction: 4 (core dumped)

Luckily the example prints out the process id, so we know that (on Mac OS) the core file will be /cores/core.14104. It’s there, and it’s fairly big:

$ ls -lrth /cores/core.14104 
-r--------  1 hhellyer  admin   1.2G 11 Aug 15:39 /cores/core.14104

Now we can open this with LLDB. We specify the executable and the core file to use (with -c) and then run the standard backtrace command “bt” to produce a backtrace of our failure as a native call stack. There is also an llnode command, “v8 bt”, which does the same as backtrace but tries to fill in any stack frames that LLDB can’t find symbols for by checking whether they are JavaScript frames.

$ lldb node -c /cores/core.14104

(lldb) bt
* thread #1: tid = 0x0000, 0x000000010079f162 node`v8::base::OS::Abort() + 18, stop reason = signal SIGSTOP
  * frame #0: 0x000000010079f162 node`v8::base::OS::Abort() + 18
    frame #1: 0x000000010042cf20 node`v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) + 416
    frame #2: 0x000000010052f78f node`v8::internal::Runtime_Throw(int, v8::internal::Object**, v8::internal::Isolate*) + 47
    frame #3: 0x00001272d7d0963b
    frame #4: 0x00001272d7ea6345
    frame #5: 0x00001272d7d09ff7
    frame #6: 0x00001272d7e99408
    frame #7: 0x00001272d7e1da52
    frame #8: 0x00001272d7e1d57b
    frame #9: 0x00001272d7d09ff7
    frame #10: 0x00001272d7e977bf
    frame #11: 0x00001272d7e95413
    frame #12: 0x00001272d7d318fd
    frame #13: 0x00001272d7d15b62
    frame #14: 0x00000001002daa88 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #15: 0x0000000100166a84 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #16: 0x0000000100684230 node`node::Parser::on_headers_complete_() + 506
    frame #17: 0x0000000100785707 node`http_parser_execute + 531
    frame #18: 0x0000000100684975 node`node::Parser::Execute(char*, unsigned long) + 85
    frame #19: 0x0000000100684c4f node`node::Parser::OnReadImpl(long, uv_buf_t const*, uv_handle_type, void*) + 107
    frame #20: 0x0000000100693c5d node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
    frame #21: 0x0000000100793622 node`uv__stream_io + 1235
    frame #22: 0x000000010079a9fa node`uv__io_poll + 1554
    frame #23: 0x000000010078c1bb node`uv_run + 321
    frame #24: 0x0000000100667fbb node`node::Start(int, char**) + 583
    frame #25: 0x0000000100000a34 node`start + 52

The problem with this backtrace is obvious if you look at stack frames #3 to #13. That’s where v8 is executing your JavaScript code. Because it’s running code that v8 has generated at runtime, not compiled C/C++, the debugger has no way to find symbols (names) for the functions being executed, and the backtrace produced by “bt” has gaps. If we use the llnode command “v8 bt” to walk the stack instead, we get a better result:

(lldb) v8 bt
 * thread #1: tid = 0x0000, 0x000000010079f162 node`v8::base::OS::Abort() + 18, stop reason = signal SIGSTOP
  * frame #0: 0x000000010079f162 node`v8::base::OS::Abort() + 18
    frame #1: 0x000000010042cf20 node`v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) + 416
    frame #2: 0x000000010052f78f node`v8::internal::Runtime_Throw(int, v8::internal::Object**, v8::internal::Isolate*) + 47
    frame #3: 0x00001272d7d0963b <internal code>
    frame #4: 0x00001272d7ea6345 CALL_NON_FUNCTION(this=0x00003c65271041b9:<undefined>) at native runtime.js:1:10 fn=0x00003c652719f951
    frame #5: 0x00001272d7d09ff7 <adaptor>
    frame #6: 0x00001272d7e99408 myRequestListener(this=0x00001090350042e1:<Object: Server>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00001090350cfde9:<Object: ServerResponse>) at /Users/hhellyer/work/testscripts/httpexception.js:5:54 fn=0x000010903508eca1
    frame #7: 0x00001272d7e1da52 emitTwo(this=0x00003c65271041b9:<undefined>, 0x000010903508eca1:<function: myRequestListener at /Users/hhellyer/work/testscripts/httpexception.js:5:54>, 0x00003c6527104211:<true>, 0x00001090350042e1:<Object: Server>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00001090350cfde9:<Object: ServerResponse>) at events.js:85:17 fn=0x00001e84e0224b69
    frame #8: 0x00001272d7e1d57b emit(this=0x00001090350042e1:<Object: Server>, 0x0000002b42a10fd9:<String: "request">) at events.js:117:44 fn=0x00003c65271ca4d9
    frame #9: 0x00001272d7d09ff7 <adaptor>
    frame #10: 0x00001272d7e977bf parserOnIncoming(this=0x00001090350c7b51:<Object: HTTPParser>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00003c6527104211:<true>) at _http_server.js:459:28 fn=0x00001090350c68b1
    frame #11: 0x00001272d7e95413 parserOnHeadersComplete(this=0x00001090350c7b51:<Object: HTTPParser>, <Smi: 1>, <Smi: 1>, 0x00001090350cd6f9:<Array: length=16>, <Smi: 1>, 0x00001090350cdab9:<String: "/login?password=...">, 0x00003c65271041b9:<undefined>, 0x00003c65271041b9:<undefined>, 0x00003c6527104251:<false>, 0x00003c6527104211:<true>) at _http_common.js:43:33 fn=0x000010903509be09
    frame #12: 0x00001272d7d318fd <internal>
    frame #13: 0x00001272d7d15b62 <entry>
    frame #14: 0x00000001002daa88 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #15: 0x0000000100166a84 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #16: 0x0000000100684230 node`node::Parser::on_headers_complete_() + 506
    frame #17: 0x0000000100785707 node`http_parser_execute + 531
    frame #18: 0x0000000100684975 node`node::Parser::Execute(char*, unsigned long) + 85
    frame #19: 0x0000000100684c4f node`node::Parser::OnReadImpl(long, uv_buf_t const*, uv_handle_type, void*) + 107
    frame #20: 0x0000000100693c5d node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
    frame #21: 0x0000000100793622 node`uv__stream_io + 1235
    frame #22: 0x000000010079a9fa node`uv__io_poll + 1554
    frame #23: 0x000000010078c1bb node`uv_run + 321
    frame #24: 0x0000000100667fbb node`node::Start(int, char**) + 583
    frame #25: 0x0000000100000a34 node`start + 52

llnode uses the LLDB API to walk the stack exactly as before, but when LLDB can’t provide any details for a frame it checks whether the frame is JavaScript and, if it is, fills in the details.

With the native C/C++ stacks and the JavaScript stacks integrated, we can see the calls made from libuv down into the JavaScript code where we forced an exception. This is obviously a contrived example, but when you are attempting to debug a problem that you suspect began in some native code and ended up in JavaScript (or vice versa), having the combined stacks is a big advantage.

Because we named it, we can easily see that the callback myRequestListener was being run at frame #6. We can select that frame in LLDB and use the llnode extensions to dump the source out:

(lldb) frame select 6
frame #6: 0x00001272d7e99408
->  0x1272d7e99408: movq   -0x8(%rbp), %rsi
    0x1272d7e9940c: addq   $0x8, %rsp
    0x1272d7e99410: movq   -0x58(%r13), %rax
    0x1272d7e99414: movabsq $0x2b42ae4319, %rbx       ; imm = 0x2B42AE4319 
(lldb) v8 source list
  5 server = http.createServer(function myRequestListener(req, res) {
  6   res.writeHead(200, {'Content-Type': 'text/plain'});
  7   res.end('Hello World\n');
  8   res.not_a_function()

Finally, we can also inspect variables (with “v8 inspect”), so we can have a look at the arguments to the listener. Let’s look at the request object and then at the url field inside it:

(lldb) v8 inspect 0x00001090350cdb69
0x00001090350cdb69:<Object: IncomingMessage properties {
    ._readableState=0x00001090350cdc59:<Object: ReadableState>,
    .readable=0x00003c6527104211:<true>,
    .domain=0x00003c6527104101:<null>,
    ._events=0x00001090350ce079:<Object: Object>,
    ._eventsCount=<Smi: 0>,
    ._maxListeners=0x00003c65271041b9:<undefined>,
    .socket=0x00001090350c48a9:<Object: Socket>,
    .connection=0x00001090350c48a9:<Object: Socket>,
    .httpVersionMajor=<Smi: 1>,
    .httpVersionMinor=<Smi: 1>,
    .httpVersion=0x00001090350cf581:<String: "1.1">,
    .complete=0x00003c6527104251:<false>,
    .headers=0x00001090350ce6e9:<Object: Object>,
    .rawHeaders=0x00001090350ce8d9:<Array: length=16>,
    .trailers=0x00001090350ce941:<Object: Object>,
    .rawTrailers=0x00001090350ce9a9:<Array: length=0>,
    .upgrade=0x00003c6527104251:<false>,
    .url=0x00001090350cdab9:<String: "/customer/detail...">,
    .method=0x000010903509ba11:<String: "GET">,
    .statusCode=0x00003c6527104101:<null>,
    .statusMessage=0x00003c6527104101:<null>,
    .client=0x00001090350c48a9:<Object: Socket>,
    ._consuming=0x00003c6527104251:<false>,
    ._dumped=0x00003c6527104251:<false>}>
(lldb) v8 inspect --full-string 0x00001090350cdab9
0x00001090350cdab9:<String: "/customer/details?custno=12345">

We specify the object to inspect using its address in memory, not its name. We can see that llnode lets us inspect variables and drill down to see the fields inside them. As well as being incredibly useful, the “v8 inspect” and “v8 source list” commands also highlight that everything is in the core dump: your users’ data and even your code.

There’s much more you can do with llnode, and ways you can use it to investigate problems at a higher level than just examining individual objects. Brendan Gregg has written an excellent article on using llnode for Node.js Memory Analysis. If there’s anything in this article you have questions about, feel free to leave a comment here or ask me on Twitter.