This article is aimed as being a primer for developers who are interested in using core dumps to help them debug Node.js problems. If you are a C/C++ developer who is comfortable with core dumps, ulimits, gcore and debuggers then you are probably comfortable with using core dumps. Being able to integrate JavaScript debugging with C/C++ may be useful if you are working on native npm modulues or other C/C++ code that interacts with JavaScript. If you’re a pure Node.js developer wondering if using core dumps is something you should care about or just be aware of in case of an emergency later on then hopefully this will provide a starting place.

In this article I’ll give a basic demonstration of how you can use llnode to give you a better view of how your JavaScript code is invoked by Node.js. In future posts I’ll cover other commands and how they can help with specific types of problem.

A core dump is a snapshot of a processes memory at a particular point in time. There are many ways to create one but the most usual is a crash within the process that causes the operating system to create a core dump and terminate the process. Generally they accessed using a debugger such as lldb or gdb.

llnode is a plugin for the lldb debugger that lets it make sense of many of the v8 structures within a core dump. You can use it to help understand what the state of your program was at the time the core dump was generated.

You can find the instructions on how to compile and install llnode on the projects github page.

Core dumps

If you aren’t really familiar with core dumps there’s a couple of things you should know about them, although really they are the same thing:

  • Core dumps are a snapshot of your processes entire memory so core dumps are big. Because of that although most operating systems can create one when a process crashes this is often disabled by default. This is the reason they are often disabled, it’s bad if a process crashes frequently and needs restarting. It’s disastrous if you set it up to be restarted automatically but it keeps crashing and fills your disk.
  • Core dumps are a snapshot of your processes entire memory which includes things like usernames and passwords and other things you don’t want to share. Giving someone a core dump of your application gives them access to all the data it had in memory at the time the core dump was created.
  • There’s several ways to get a core dump. The most well known is the OS generating one when your application crashes. Node.js has a flag –abort-on-uncaught-exception which makes it easy to get a core dump from JavaScript code by forcing an abort when an exception crashes your program and that’s a good way to experiment with llnode but we won’t need that for our examples below. Since core dumps are often disabled to stop crashing applications filling up your disk we should make sure they are enabled. The basic way to re-enable this is to change the ulimit for core files from 0 to unlimited:

    $ ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    ....
    
    $ ulimit -c unlimited
    $ ulimit -a
    core file size          (blocks, -c) unlimited
    data seg size           (kbytes, -d) unlimited
    ....
    

    Setting the ulimit at the terminal in this way will change the setting for processes started within that terminal session but is not a permanent system wide change.

    On Mac OS this should cause core files to be created in /cores. The core file will be named /cores/core.[process_id].

    On Linux, by default, it would allow them to be created in the current directory. On Linux there are a number of other configuration settings that can change what happens with a core dump, if they are created, where they are created and what they are called. In particular the setting in /proc/sys/kernel/core_pattern which can change the file name or even pipe the core dump to another tool. You can probably get up to date instructions for your distribution just by Googling “enable core dumps [distro-name]”. A good rule of thumb is to check the syslog after a core dump has been generated as there will often be a message in there telling you where it went.

    llnode Example

    Starting with a simple (and broken) web server:

    var http = require('http');
    var host = "127.0.0.1"
    var port = 1338
    
    server = http.createServer(function myRequestListener(req, res) {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.end('Hello World\n');
      res.not_a_function()
    }).listen(port, host);
    
    console.log(`Server process ${process.pid} running at http://${host}:${port}/`);
    

    This will only run once, the exception inside the requestListener function. (I’ve named my request listener as it makes for a better stack trace. I don’t know of any technical reasons not to do that and it does make debugging easier, native or otherwise.)

    If you run this with:

    $ node --abort-on-uncaught-exception httpexception.js 
    Server process 14104 running at http://127.0.0.1:1338/
    Uncaught TypeError: res.not_a_function is not a function
    
    FROM
    Server.myRequestListener (/Users/hhellyer/work/testscripts/httpexception.js:8:7)
    emitTwo (events.js:87:13)
    Server.emit (events.js:172:7)
    HTTPParser.parserOnIncoming [as onIncoming] (_http_server.js:533:12)
    HTTPParser.parserOnHeadersComplete (_http_common.js:103:23)
    Illegal instruction: 4 (core dumped)
    

    Luckily the example prints out the process id and that means we know that (on Mac OSX) the core file will be /cores/core.14104, it’s there and it’s fairly big:

    $ ls -lrth /cores/core.14104 
    -r--------  1 hhellyer  admin   1.2G 11 Aug 15:39 /cores/core.14104
    

    Now we can open this with lldb. We specify the executable and the core file to use (with -c) and then run the standard backtrace command “bt” to produce a backtrace of our failure as a native call stack. There is also an llnode command called “v8 bt” this does the same as backtrace but tries to fill in any stack frames that lldb can’t find symbols for by seeing if they are JavaScript frames.

    $ lldb node -c /cores/core.14104
    
    (lldb) bt
    * thread #1: tid = 0x0000, 0x000000010079f162 node`v8::base::OS::Abort() + 18, stop reason = signal SIGSTOP
      * frame #0: 0x000000010079f162 node`v8::base::OS::Abort() + 18
        frame #1: 0x000000010042cf20 node`v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) + 416
        frame #2: 0x000000010052f78f node`v8::internal::Runtime_Throw(int, v8::internal::Object**, v8::internal::Isolate*) + 47
        frame #3: 0x00001272d7d0963b
        frame #4: 0x00001272d7ea6345
        frame #5: 0x00001272d7d09ff7
        frame #6: 0x00001272d7e99408
        frame #7: 0x00001272d7e1da52
        frame #8: 0x00001272d7e1d57b
        frame #9: 0x00001272d7d09ff7
        frame #10: 0x00001272d7e977bf
        frame #11: 0x00001272d7e95413
        frame #12: 0x00001272d7d318fd
        frame #13: 0x00001272d7d15b62
        frame #14: 0x00000001002daa88 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
        frame #15: 0x0000000100166a84 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
        frame #16: 0x0000000100684230 node`node::Parser::on_headers_complete_() + 506
        frame #17: 0x0000000100785707 node`http_parser_execute + 531
        frame #18: 0x0000000100684975 node`node::Parser::Execute(char*, unsigned long) + 85
        frame #19: 0x0000000100684c4f node`node::Parser::OnReadImpl(long, uv_buf_t const*, uv_handle_type, void*) + 107
        frame #20: 0x0000000100693c5d node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
        frame #21: 0x0000000100793622 node`uv__stream_io + 1235
        frame #22: 0x000000010079a9fa node`uv__io_poll + 1554
        frame #23: 0x000000010078c1bb node`uv_run + 321
        frame #24: 0x0000000100667fbb node`node::Start(int, char**) + 583
        frame #25: 0x0000000100000a34 node`start + 52
    

    The problem with this backtrace is obvious by looking at stack frames #3 to #13. That’s where v8 is executing your JavaScript code. Because it’s running code v8 has generated at runtime and not compiled C/C++ the debugger has way to find symbols (names) for the functions that are being executed and the backtrace produced by “bt” has gaps. If we use the llnode command “v8 bt” to walk the stack instead we get a better result:

    (lldb) v8 bt
     * thread #1: tid = 0x0000, 0x000000010079f162 node`v8::base::OS::Abort() + 18, stop reason = signal SIGSTOP
      * frame #0: 0x000000010079f162 node`v8::base::OS::Abort() + 18
        frame #1: 0x000000010042cf20 node`v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) + 416
        frame #2: 0x000000010052f78f node`v8::internal::Runtime_Throw(int, v8::internal::Object**, v8::internal::Isolate*) + 47
        frame #3: 0x00001272d7d0963b <internal code>
        frame #4: 0x00001272d7ea6345 CALL_NON_FUNCTION(this=0x00003c65271041b9:<undefined>) at native runtime.js:1:10 fn=0x00003c652719f951
        frame #5: 0x00001272d7d09ff7 <adaptor>
        frame #6: 0x00001272d7e99408 myRequestListener(this=0x00001090350042e1:<Object: Server>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00001090350cfde9:<Object: ServerResponse>) at /Users/hhellyer/work/testscripts/httpexception.js:5:54 fn=0x000010903508eca1
        frame #7: 0x00001272d7e1da52 emitTwo(this=0x00003c65271041b9:<undefined>, 0x000010903508eca1:<function: myRequestListener at /Users/hhellyer/work/testscripts/httpexception.js:5:54>, 0x00003c6527104211:<true>, 0x00001090350042e1:<Object: Server>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00001090350cfde9:<Object: ServerResponse>) at events.js:85:17 fn=0x00001e84e0224b69
        frame #8: 0x00001272d7e1d57b emit(this=0x00001090350042e1:<Object: Server>, 0x0000002b42a10fd9:<String: "request">) at events.js:117:44 fn=0x00003c65271ca4d9
        frame #9: 0x00001272d7d09ff7 <adaptor>
        frame #10: 0x00001272d7e977bf parserOnIncoming(this=0x00001090350c7b51:<Object: HTTPParser>, 0x00001090350cdb69:<Object: IncomingMessage>, 0x00003c6527104211:<true>) at _http_server.js:459:28 fn=0x00001090350c68b1
        frame #11: 0x00001272d7e95413 parserOnHeadersComplete(this=0x00001090350c7b51:<Object: HTTPParser>, <Smi: 1>, <Smi: 1>, 0x00001090350cd6f9:<Array: length=16>, <Smi: 1>, 0x00001090350cdab9:<String: "/login?password=...">, 0x00003c65271041b9:<undefined>, 0x00003c65271041b9:<undefined>, 0x00003c6527104251:<false>, 0x00003c6527104211:<true>) at _http_common.js:43:33 fn=0x000010903509be09
        frame #12: 0x00001272d7d318fd <internal>
        frame #13: 0x00001272d7d15b62 <entry>
        frame #14: 0x00000001002daa88 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
        frame #15: 0x0000000100166a84 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
        frame #16: 0x0000000100684230 node`node::Parser::on_headers_complete_() + 506
        frame #17: 0x0000000100785707 node`http_parser_execute + 531
        frame #18: 0x0000000100684975 node`node::Parser::Execute(char*, unsigned long) + 85
        frame #19: 0x0000000100684c4f node`node::Parser::OnReadImpl(long, uv_buf_t const*, uv_handle_type, void*) + 107
        frame #20: 0x0000000100693c5d node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
        frame #21: 0x0000000100793622 node`uv__stream_io + 1235
        frame #22: 0x000000010079a9fa node`uv__io_poll + 1554
        frame #23: 0x000000010078c1bb node`uv_run + 321
        frame #24: 0x0000000100667fbb node`node::Start(int, char**) + 583
        frame #25: 0x0000000100000a34 node`start + 52
    

    llnode uses the lldb API to walk the stack exactly as before but when lldb can’t provide any details for a frame it checks to see if it is JavaScript and if it is fills it in.

    With the native C/C++ stacks and JavaScript stacks integrated we can also see the calls made from the libuv down into the JavaScript code where we forced an exception. This is obviously a contrived example but when you are attempting to debug a problem that you suspect began in some native code and ended up in JavaScript (or vice-versa) then having the combined stacks is a big advantage.

    Because we named it we can easily see that the callback myRequestListener was being run at frame #6. We can select that< frame in lldb and use the llnode extensions to dump the source out:

    (lldb) frame select 6
    frame #6: 0x00001272d7e99408
    ->  0x1272d7e99408: movq   -0x8(%rbp), %rsi
        0x1272d7e9940c: addq   $0x8, %rsp
        0x1272d7e99410: movq   -0x58(%r13), %rax
        0x1272d7e99414: movabsq $0x2b42ae4319, %rbx       ; imm = 0x2B42AE4319 
    (lldb) v8 source list
      5 server = http.createServer(function myRequestListener(req, res) {
      6   res.writeHead(200, {'Content-Type': 'text/plain'});
      7   res.end('Hello World\n');
      8   res.not_a_function()
    

    Finally we can also inspect variables (with “v8 inspect”) so we can have a look at the arguments to the listener, lets look at the request object and then at the url field inside it:

    (lldb) v8 inspect 0x00001090350cdb69
    0x00001090350cdb69:<Object: IncomingMessage properties {
        ._readableState=0x00001090350cdc59:<Object: ReadableState>,
        .readable=0x00003c6527104211:<true>,
        .domain=0x00003c6527104101:<null>,
        ._events=0x00001090350ce079:<Object: Object>,
        ._eventsCount=<Smi: 0>,
        ._maxListeners=0x00003c65271041b9:<undefined>,
        .socket=0x00001090350c48a9:<Object: Socket>,
        .connection=0x00001090350c48a9:<Object: Socket>,
        .httpVersionMajor=<Smi: 1>,
        .httpVersionMinor=<Smi: 1>,
        .httpVersion=0x00001090350cf581:<String: "1.1">,
        .complete=0x00003c6527104251:<false>,
        .headers=0x00001090350ce6e9:<Object: Object>,
        .rawHeaders=0x00001090350ce8d9:<Array: length=16>,
        .trailers=0x00001090350ce941:<Object: Object>,
        .rawTrailers=0x00001090350ce9a9:<Array: length=0>,
        .upgrade=0x00003c6527104251:<false>,
        .url=0x00001090350cdab9:<String: "/customer/detail...">,
        .method=0x000010903509ba11:<String: "GET">,
        .statusCode=0x00003c6527104101:<null>,
        .statusMessage=0x00003c6527104101:<null>,
        .client=0x00001090350c48a9:<Object: Socket>,
        ._consuming=0x00003c6527104251:<false>,
        ._dumped=0x00003c6527104251:<false>}>
    (lldb) v8 inspect --full-string 0x00001090350cdab9
    0x00001090350cdab9:<String: "/customer/details?custno=12345">
    

    We specify the object to inspect using it’s address in memory not it’s name. We can see that llnode lets us inspect variables and drill down to see the fields inside them. As well as being incredibly useful the “v8 inspect” and “v8 source list” commands also highlight that everything is in the core dump – your users data and even your code.

    There’s much more you can do with llnode and ways you can use it to investigate problems at a higher level than just examining individual objects. Brendan Gregg has done an excellent article on using llnode for Node.js Memory Analysis. If there’s anything in the article you have questions about feel free to leave a comment here or ask me on twitter.

    9 comments on"Exploring Node.js core dumps using the llnode plugin for lldb."

    1. […] The findrefs command extends llnode to allow you to discover which objects refer to another object and enhances the memory analysis capabilities of llnode. llnode is a plugin that allows you to explore Node.js core dumps with the lldb debugger. […]

    2. […] Hellyer has provided an introduction to the use of the LLDB debugger in his post Exploring Node.js core dumps using the llnode plugin for lldb and describes a new command available in the llnode plugin here nodeinfo command for llnode. He has […]

    3. Hello Howard,

      I followed this tutorial on RHEL6.8 + LLVM/LLDB3.8.1 + LLNODE + NODE4.6.2 from within my VirtualBox5.1.8, the coredump file generated and lldb/llnode can analyze it perfectly.

      However, in our dev server(same OS and same lldb/nodejs version), the coredump file generated, but the coredump file cannot be analyzed by lldb, lldb itself core dumpped when running `lldb node -c coredump_file_name`

      I then use gdb to debug the coredump file generated by lldb, the details as follows:

      “`
      Core was generated by `lldb node -c core.43491′.
      Program terminated with signal 11, Segmentation fault.
      #0 0x00007fc30d40f940 in lldb_private::ArchSpec::GetMachine() const ()
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.7.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-42z1.el6_7.x86_64 libcom_err-1.41.12-22.el6.x86_64 libedit-2.11-4.20080712cvs.1.el6.x86_64 libgcc-4.4.7-16.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64 libstdc++-4.4.7-16.el6.x86_64 libuuid-2.17.2-12.18.el6.x86_64 libxml2-2.7.6-21.el6_8.1.x86_64 ncurses-libs-5.7-4.20090207.el6.x86_64 openssl-1.0.1e-42.el6_7.4.x86_64 zlib-1.2.3-29.el6.x86_64
      (gdb) bt
      #0 0x00007fc30d40f940 in lldb_private::ArchSpec::GetMachine() const ()
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      #1 0x00007fc30d8ec0ff in RegisterContextPOSIX_x86::RegisterContextPOSIX_x86(lldb_private::Thread&, unsigned int, lldb_private::RegisterInfoInterface*) ()
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      #2 0x00007fc30d7ba0c4 in RegisterContextCorePOSIX_x86_64::RegisterContextCorePOSIX_x86_64(lldb_private::Thread&, lldb_private::RegisterInfoInterface*, lldb_private::DataExtractor const&, lldb_private::DataExtractor const&) ()
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      #3 0x00007fc30d7b87f2 in ThreadElfCore::CreateRegisterContextForFrame(lldb_private::StackFrame*) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #4 0x00007fc30d7b806d in ThreadElfCore::GetRegisterContext ()
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      #5 0x00007fc30d59b16c in lldb_private::StackFrameList::GetFramesUpTo(unsigned int) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #6 0x00007fc30d59bae6 in lldb_private::StackFrameList::ResetCurrentInlinedDepth() () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #7 0x00007fc30d5c21e2 in lldb_private::Thread::ShouldStop(lldb_private::Event*)
      () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #8 0x00007fc30d5c90de in lldb_private::ThreadList::ShouldStop(lldb_private::Event*) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #9 0x00007fc30d589b13 in lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #10 0x00007fc30d589bc1 in lldb_private::Process::HandlePrivateEvent(std::shared_ptr&) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #11 0x00007fc30d58acc3 in lldb_private::Process::RunPrivateStateThread(bool) ()
      —Type to continue, or q to quit—
      from /usr/local/bin/../lib/liblldb.so.3.8.1
      #12 0x00007fc30d4072a2 in lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) () from /usr/local/bin/../lib/liblldb.so.3.8.1
      #13 0x0000003d52807aa1 in start_thread () from /lib64/libpthread.so.0
      #14 0x0000003d520e893d in clone () from /lib64/libc.so.6
      “`

      Any suggestions? Many thanks.

      • Howard Hellyer December 12, 2016

        That looks like a bug in lldb’s architecture handling code. We’ve seen a couple of issues around opening core dumps generated in different ways. Was the core from your production server generated via gcore or a signal/crash?
        You may get the same result but it might be worth trying copying the core file from the production machine to your development environment where you have been able to open a core dump before.

      • Richard Chamberlain December 13, 2016

        Stack trace looks the same as in this LLDB bugzilla.
        https://llvm.org/bugs/show_bug.cgi?id=25106

        They indicate that it was fixed in trunk i.e. LLDB 4.0. We should perhaps get the fix back-ported

    4. […] llnode plugin for lldb allows you to investigate Node.js problems using lldb. The gencore npm package exists to generate core dumps in situations where other tools may not work […]

    5. […] and causing node to generate a core file on failure is a production best-practice. (See Exploring Core Dumps for information on how to use core files for post-mortem debugging.) Despite this, most clouds run […]

    6. […] snapshot of the application’s process memory, written as a binary file. You can use the lldb debugger with the llnode plugin to inspect the core […]

    7. […] snapshot of the application’s process memory, written as a binary file. You can use the lldb debugger with the llnode plugin to inspect the core […]

    Join The Discussion

    Your email address will not be published. Required fields are marked *