Applications that dynamically allocate memory are prone to ‘memory leaks’.

When a region of memory is allocated by the application – for example, using malloc() – a reference to that memory is obtained. The memory can then be used to represent application data, objects, and so on. The application is responsible for indicating when that memory is no longer required – for example, by calling free() – which deallocates the region, allowing it to be reused.

However, if the application fails to do this and discards the reference, that memory cannot be reused and is leaked.
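The same pattern can be sketched in Swift's unsafe-pointer API, which mirrors malloc()/free() directly (the calls below are standard library functions; the buffer itself is purely illustrative):

```swift
// Allocate space for 1,024 integers – the analogue of malloc():
let buffer = UnsafeMutablePointer<Int>.allocate(capacity: 1024)
buffer.initialize(repeating: 0, count: 1024)

// Use the memory to hold application data:
buffer[0] = 42
let stored = buffer[0]

// Indicate that the memory is no longer required – the analogue of
// free(). If these calls are skipped and `buffer` goes out of scope,
// the region can never be reused: it has been leaked.
buffer.deinitialize(count: 1024)
buffer.deallocate()
```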

Memory management in Swift

The Swift programming language employs a system called Automatic Reference Counting (ARC) to eliminate much of the burden and error of memory management. When an object is created, ARC allocates memory for it on your behalf, and also maintains a count of the number of references to the object. Once the reference count drops to zero, ARC knows the object can no longer be in use, and it automatically deallocates the memory.
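A minimal sketch of this behaviour, using a hypothetical Widget class that tracks how many of its instances are currently alive:

```swift
class Widget {
    static var liveCount = 0         // number of Widget objects currently alive
    init { Widget.liveCount += 1 }
    deinit { Widget.liveCount -= 1 } // ARC calls deinit on deallocation
}

var a: Widget? = Widget()   // reference count of the object: 1
var b: Widget? = a          // reference count: 2

a = nil                     // count drops to 1 – the object stays alive
b = nil                     // count drops to 0 – ARC deallocates it immediately
// Widget.liveCount is now back to 0
```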

However, memory leaks can still occur in Swift. ARC relies on retain counts to determine when objects can be deallocated, and can be defeated by ‘retain cycles’: graphs of objects that refer to each other. Objects in such graphs cannot be deallocated unless the cycle is broken.

Even if your application is written in pure Swift, it may rely on a C library under the covers – such as a crypto library or database connector – and that library's manual memory management can also leak.

Tooling to debug memory leaks on the Mac platform is provided within Xcode’s Instruments – and tutorials on its use are fairly plentiful. But what do you do if you encounter a leak that only manifests on Linux?

Valgrind and Massif

The Valgrind package includes a suite of tools that can be used to diagnose and debug a number of memory and threading-related bugs. One of the included tools is Massif, a heap profiler, which keeps track of the underlying allocations and deallocations your application performs – or that ARC performs, on your behalf – and can tell you where memory is being retained, at periodic intervals (‘snapshots’).

Identifying leaks

Memory usage typically rises and falls over time – for example, as different numbers of clients connect and perform requests. Measured over a long enough period, a healthy application shows no overall upward or downward trend. If a memory leak is present, however, there will be an upward trend.

Such leaks can be readily spotted by system monitoring tools such as the top command, a monitoring framework such as SwiftMetrics, or your cloud provider’s monitoring tools. The Resident Set Size (RSS) is the key metric, representing the real physical memory allocated to your process – a finite resource that can be exhausted. In contrast, Virtual Memory (VSZ) represents memory a process could access; it can be much higher than the RSS, and is not generally a cause for concern.
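On Linux, you can sample both metrics for a single process with ps. The command below inspects the current shell ($$) as a stand-in for your application's PID:

```shell
# Print the RSS and VSZ (both in kB) of a running process:
ps -o rss=,vsz= -p $$

# Sample the RSS periodically to spot an upward trend (replace <PID>):
#   while true; do ps -o rss= -p <PID>; sleep 5; done
```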

You may also be interested in investigating memory spikes: periods of high peak memory (RSS) consumption, but where that memory is eventually released. Massif can help here too – comparing the detailed snapshots before and during such a spike will help you determine where the extra memory is being allocated. It won’t necessarily tell you why, but it will guide you toward the area of code which may – for example – benefit from some extra logging or debugging output.

Installing the tools

For Ubuntu Linux, valgrind can be installed simply with:

sudo apt-get install valgrind

Preparing your application

While there is no specific preparation required to use Valgrind, it is recommended that you compile your application in debug mode. You can profile a release-mode (optimized) build, but the compiler will likely have applied inlining and other optimizations, meaning the call stacks in your memory profile will not match your source code, and line number information will not be available.

Generating a memory profile

This process is performed in two stages:

  1. Run your application through Massif, and cause the leak to occur. This may be simply letting the program execute, or it may involve interaction – or, in the case of a server process, the simulation of client load. Once you are finished, terminate the application; a massif log file will have been created.

    • Note that running under Valgrind causes applications to execute far slower than normal – potentially 10x or more. If the leak only manifests after a certain number of operations, bear this in mind when deciding how long to profile for.
  2. Post-process the log file into a human-readable report using the  ms_print  command, and optionally, use  swift-demangle  to convert the Swift symbols into a human-readable form.

Start your application with:

valgrind --tool=massif .build/debug/MyApplication

You should see some additional output from Massif, followed by the normal output of your application.

The number at the beginning of the Massif header is the process identifier (PID) of your application. After exercising your application and terminating it, you should now have a logfile named  massif.out.<PID>  in the current working directory.

Run the post-processor and swift demangler as follows:

ms_print massif.out.<PID> | swift-demangle > report.txt

…replacing  <PID> with the appropriate number. You can then inspect the report, which is made up of two main sections: a graph of the memory used over time, and a series of snapshots detailing the memory consumption.
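Massif's defaults are usually sufficient, but a few of its options – documented in the Valgrind manual – can be worth tuning:

```shell
# Measure 'time' in bytes allocated rather than instructions executed –
# often clearer for allocation-focused investigations:
valgrind --tool=massif --time-unit=B .build/debug/MyApplication

# Keep more snapshots, and take a detailed one every 5th snapshot
# (the defaults are 100 snapshots, detailed every 10th):
valgrind --tool=massif --max-snapshots=200 --detailed-freq=5 .build/debug/MyApplication
```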

A retain cycle example

Let’s create a simple demo that illustrates a retain cycle. This example models diners seated at a restaurant, with each Diner object having a reference to the one sitting to their left. Table A has one empty seat, while table B is fully occupied.

In this diagram, the blue arrows represent a reference between objects, and the green circles represent their retain count.
The corresponding sample code:

class Diner {
  let seat: Int
  var left: Diner? = nil

  convenience init(seat: Int) {
    self.init(seat: seat, left: nil)
  }

  init(seat: Int, left: Diner?) {
    self.seat = seat
    self.left = left
  }
}

print("Seating diners...")

for _ in 1...10000 {
  // Table A
  let andy = Diner(seat: 1)
  let chris = Diner(seat: 2, left: andy)
  let dave = Diner(seat: 3, left: chris)
  // Seat 4 is empty
  let enrique = Diner(seat: 5, left: nil)
  let helen = Diner(seat: 6, left: enrique)
  andy.left = helen

  // Table B
  let ian = Diner(seat: 1)
  let kye = Diner(seat: 2, left: ian)
  let mike = Diner(seat: 3, left: kye)
  let neil = Diner(seat: 4, left: mike)
  ian.left = neil
}

print("Finished.")

To simulate a gradual memory leak, ‘application load’ is applied – represented here by a simple for loop. The objects are created within the loop and assigned to loop-local constants, which go out of scope each time the loop repeats.

Table A does not represent a retain cycle: no one is sitting to the right of Dave, so Dave has a reference count of 1, whereas the other diners have a reference count of 2. Once the loop iteration completes, the reference count on Dave drops to zero and the object can be released. Doing so reduces the reference count of Chris to zero, and so on down the chain.

Table B contains a retain cycle: each Diner is referenced by another, and once created, they all have a reference count of 2. Once the loop iteration completes, the reference count of each Diner is 1 – and hence none can be released. These objects are now leaked: they remain in existence, but no other part of the application has a reference to them.
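For completeness – though finding the leak, not fixing it, is the subject of this article – the usual way to break such a cycle in Swift is a weak reference, which does not contribute to the retain count. A sketch of a fixed Diner follows; the liveCount property is added purely to demonstrate the deallocations, and whether weak or unowned is appropriate depends on your model's ownership semantics:

```swift
class Diner {
    static var liveCount = 0
    let seat: Int
    weak var left: Diner?   // weak: does not retain the neighbour

    init(seat: Int, left: Diner? = nil) {
        self.seat = seat
        self.left = left
        Diner.liveCount += 1
    }
    deinit { Diner.liveCount -= 1 }
}

do {
    // Table B, as before:
    let ian  = Diner(seat: 1)
    let kye  = Diner(seat: 2, left: ian)
    let mike = Diner(seat: 3, left: kye)
    let neil = Diner(seat: 4, left: mike)
    ian.left = neil
    // All four are alive here, kept so by the local constants.
    precondition(Diner.liveCount == 4)
}
// The locals have gone out of scope; because the `left` links are
// weak, there is no cycle, and all four diners are deallocated.
```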

After building our Swift project with swift build, we can then execute it through Massif and generate a report:

    dave@ubuntu:$ valgrind --tool=massif .build/debug/RetCycleDemo
    ==24123== Massif, a heap profiler
    ==24123== Copyright (C) 2003-2013, and GNU GPL'd, by Nicholas Nethercote
    ==24123== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
    ==24123== Command: .build/debug/RetCycleDemo
    ==24123==

    Seating diners...
    Finished.
    ==24123==

    dave@ubuntu:$ ms_print massif.out.24123 | swift-demangle > report.txt

Let’s look at the elements of the report:

1. A graphical representation of the memory used by your application over time. Our demo has a straightforward linear memory leak, which looks like this:

        MB
    1.521^                                                                       #
         |                                                                    @@@#
         |                                                                 @@@@@@#
         |                                                              @@@@@@@@@#
         |                                                          @@@@@@@@@@@@@#
         |                                                       @@@@@@@@@@@@@@@@#
         |                                                    @@@@ @@@@@@@@@@@@@@#
         |                                                @@@@@@@@ @@@@@@@@@@@@@@#
         |                                             @@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                                          @@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                                      @@@@@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                                   @@@@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                                @@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                             @@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                         @@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                      @@@@@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |                  @@@@@@@@@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |               @@@@@@ @@@@@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |            @@@@@@@@@ @@@@@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
         |        @@@@@@ @@@@@@ @@@@@@@@@@@@@@ @@@ @@@@@@@@@@@@@@@ @@@@@@@@@@@@@@#
       0 +----------------------------------------------------------------------->Mi
         0                                                                   53.01

2. A list of the snapshots that Massif captured during execution, including which snapshot represents the peak memory usage, and which snapshots are ‘detailed’ (including a breakdown of where memory is allocated by call stack):

    Number of snapshots: 75
     Detailed snapshots: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, ..., 71, 72, 73, 74 (peak)]

3. A table of memory statistics for each snapshot, interspersed with the breakdown for detailed snapshots:

    --------------------------------------------------------------------------------
      n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    --------------------------------------------------------------------------------
      0              0                0                0             0            0
      1      3,443,318               40               32             8            0
      2      3,998,091            7,376            6,328         1,048            0 
    …

The final detailed snapshot, number 74, which is also the peak for this execution of our application, gives us the clearest indication of where memory is being leaked:

    --------------------------------------------------------------------------------
      n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    --------------------------------------------------------------------------------
     74     55,587,768        1,595,056        1,276,472       318,584            0
    80.03% (1,276,472B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    ->79.81% (1,272,992B) 0x5240F44: swift_slowAlloc (in /home/djones6/.swiftenv/versions/4.0.3/usr/lib/swift/linux/libswiftCore.so)
    | ->79.81% (1,272,992B) 0x5240F7D: _swift_allocObject_ (in /home/djones6/.swiftenv/versions/4.0.3/usr/lib/swift/linux/libswiftCore.so)
    |   ->59.86% (954,752B) 0x4016EB: RetCycleDemo.Diner.__allocating_init(seat: Swift.Int, left: RetCycleDemo.Diner?) -> RetCycleDemo.Diner (main.swift:0)
    |   | ->19.95% (318,208B) 0x401225: main (main.swift:29)
    |   | |
    |   | ->19.95% (318,208B) 0x401251: main (main.swift:30)
    |   | |
    |   | ->19.95% (318,208B) 0x40127D: main (main.swift:31)
    |   | |
    |   | ->00.01% (128B) in 1+ places, all below ms_print's threshold (01.00%)
    |   |
    |   ->19.95% (318,240B) 0x40162E: RetCycleDemo.Diner.__allocating_init(seat: Swift.Int) -> RetCycleDemo.Diner (main.swift:0)
    |   | ->19.95% (318,208B) 0x4011F9: main (main.swift:28)
    |   | |

The summary line indicates that a total of 1,276,472 bytes of useful heap has been allocated: this is the memory our application can actually use.

  • The other memory, represented by ‘extra-heap’, is memory that was allocated in excess of what was asked for, due to alignment and the administrative overhead of each memory region. At around 20%, this proportion is quite high in this simple example, because we make many small allocations.

Next follows the breakdown of allocations by call stack, with each level of indentation representing a caller of the entry above it – so the most deeply indented entries are typically your own application code. The percentages are in terms of the total heap, not just the useful heap: 1,272,992 bytes (79.8% of the total heap, or 99.7% of the useful heap) were allocated by swift_allocObject.

  • The objects that caused these allocations are all instances of RetCycleDemo.Diner, from two different initializers:

    • 59.86% (954,752B) from  RetCycleDemo.Diner.__allocating_init(seat: Swift.Int, left: RetCycleDemo.Diner?)
    • 19.95% (318,240B) from  RetCycleDemo.Diner.__allocating_init(seat: Swift.Int)

The report also shows where the initializer was called: the first was called from  main.swift  lines 29 – 31, the second from  main.swift  line 28. This corresponds nicely to our application’s source code for table B. Because these lines repeatedly allocate objects that form a retain cycle, they cause the memory growth that Massif has reported.

There is no reference to the initializers called from lines 19 – 24 (table A), because these objects do not create a retain cycle: they are released at the end of each loop iteration, so at most one table's worth of them exists at any time and their contribution falls below ms_print's 1% threshold.

Summary

Remember that Massif isn’t going to tell you why a piece of code is leaking, but it will at least tell you which objects and functions are involved in the leak, which will guide further debugging or experimentation. In the example above, Massif does not tell us that there is a retain cycle between these objects, but the fact that they are responsible for an ever-increasing memory footprint means we can infer that they are being retained.





