Mead's (Brief) Guide To Memory Debuggers

(An Introduction to Dr. Memory and Valgrind)
Dr. Memory examples
Valgrind examples

Background

There are several tools available to help you find memory bugs in your programs. This document will discuss two of them: Dr. Memory and Valgrind.
  1. Dr. Memory: "Dr. Memory is a memory monitoring tool capable of identifying memory-related programming errors such as accesses of uninitialized memory, accesses to unaddressable memory (including outside of allocated heap units and heap underflow and overflow), accesses to freed memory, double frees, memory leaks, and (on Windows) handle leaks, GDI API usage errors, and accesses to un-reserved thread local storage slots."

    Dr. Memory operates on unmodified application binaries running on Windows, Linux, Mac, or Android on commodity IA-32, AMD64, and ARM hardware.

    Pros:

    Cons:

  2. Valgrind: "Valgrind is a GPL'd system for debugging and profiling Linux and Mac programs. With Valgrind's tool suite you can automatically detect many memory management and threading bugs, avoiding hours of frustrating bug-hunting, making your programs more stable. You can also perform detailed profiling to help speed up your programs."

    Valgrind runs on several popular platforms, such as x86, AMD64 and PPC32. Read more about the benefits of using Valgrind here.

    Pros:

    Cons:

Dr. Memory

On Windows, you can use either MinGW or Microsoft's compiler to build programs that Dr. Memory can check. In order for Dr. Memory to work correctly, you may need to compile your program with special command line options. Refer to Preparing Your Application on their website for detailed information.

Although Dr. Memory will work with Windows, Mac, and Linux, I'm only going to show how to use it under Windows because Valgrind is a better choice if you're running Mac or Linux. Also, the version of Dr. Memory used for these examples is 1.11.0 -- build 2. If you're running Windows 10 (or later), you should use the latest version as Windows is a moving target.

We're going to use this file, mem.bugs.cpp (HTML), which has several memory-related bugs:

  1. test1 - simple memory leak
  2. test2 - allocate with new[], deallocate with delete
  3. test3 - allocate with new, deallocate with delete[]
  4. test4 - allocate with malloc, deallocate with delete
  5. test5 - allocate with new, deallocate with free
  6. test6 - buffer overflow
  7. test7 - uninitialized read
Both memory debuggers can detect more errors than the ones listed, but this should give you an idea of how to use them. To experiment with code that has more errors, try this one: mem.leaks.cpp (HTML). You can see all of the output from both Microsoft (32-bit and 64-bit) and MinGW (32-bit and 64-bit) here.
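
The contents of mem.bugs.cpp aren't reproduced on this page, so here is a rough sketch of the pairing rules that tests 2 through 5 break. The function names are my own for illustration (they are not the ones in mem.bugs.cpp), and each function shows the correct pairing rather than the bug:

```cpp
#include <cassert>
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::strlen, std::memcpy, std::size_t

// Hypothetical helper names; these are not taken from mem.bugs.cpp.

// new[] must be paired with delete[] (test2 pairs it with plain delete).
std::size_t array_new_roundtrip(const char* text)
{
  assert(text != nullptr);
  std::size_t len = std::strlen(text);
  char* copy = new char[len + 1];      // room for the terminating '\0'
  std::memcpy(copy, text, len + 1);
  std::size_t result = std::strlen(copy);
  delete[] copy;                       // array form matches new[]
  return result;
}

// new must be paired with delete (test3 uses delete[], test5 uses free).
int scalar_new_roundtrip(int value)
{
  int* p = new int(value);
  int result = *p;
  delete p;                            // scalar form matches new
  return result;
}

// malloc must be paired with free (test4 uses delete).
// Returns -1 if the allocation fails.
int malloc_roundtrip(int value)
{
  int* p = static_cast<int*>(std::malloc(sizeof(int)));
  if (p == nullptr)
    return -1;
  *p = value;
  int result = *p;
  std::free(p);                        // free matches malloc
  return result;
}
```

Swapping any of the three deallocation calls for one of the others gives you the kind of mismatch that both tools are designed to flag.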


Valgrind

Valgrind is only available for Linux and Mac, and I will be showing it on Linux. Valgrind is also a suite of tools that can look for problems other than memory. Read the documentation to see all of the things that it can do. The version of Valgrind used for these examples is 3.10.1, but any newer version should work the same.

If you are running a recent version of Windows, you can enable the Windows Subsystem for Linux (WSL) which will give you a working Linux system. Once set up, you can use almost any of the plethora of developer tools for Linux. I wrote a brief tutorial on how to set up WSL here.

The first thing is to compile. The recommended options are -g (include debugging information) and -O0 (disable optimizations, that's an uppercase letter 'O' followed by a zero):
g++ -g -O0 -o gnu64 mem.bugs.cpp
Valgrind works the same way in both 32-bit and 64-bit programs, so I'm just going to show the 64-bit output.

Running the program (test #1, memory leak) under Valgrind:

valgrind ./gnu64 1
produces this output:
==25831== Memcheck, a memory error detector
==25831== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==25831== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==25831== Command: ./gnu64 1
==25831== 
==25831== 
==25831== HEAP SUMMARY:
==25831==     in use at exit: 72,827 bytes in 2 blocks
==25831==   total heap usage: 2 allocs, 0 frees, 72,827 bytes allocated
==25831== 
==25831== LEAK SUMMARY:
==25831==    definitely lost: 123 bytes in 1 blocks
==25831==    indirectly lost: 0 bytes in 0 blocks
==25831==      possibly lost: 0 bytes in 0 blocks
==25831==    still reachable: 72,704 bytes in 1 blocks
==25831==         suppressed: 0 bytes in 0 blocks
==25831== Rerun with --leak-check=full to see details of leaked memory
==25831== 
==25831== For counts of detected and suppressed errors, rerun with: -v
==25831== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Note: The numbers in the left-hand column are the process ID (PID) of the running program and can be ignored.

The important part is the "definitely lost" line, which shows the 123 bytes of memory that were not deallocated. There's also a false-positive leak of 72,704 bytes (the "still reachable" line); together, the two account for the 72,827 bytes in use at exit. I'll show how to suppress the false-positive later. There is a lot of "noise" being output, so we're going to suppress a lot of that. Do that with the --quiet option:

valgrind --quiet ./gnu64 1
Unfortunately, this has suppressed all of the output so we need to enable some of it:
valgrind --quiet --leak-check=full ./gnu64 1
Now the output looks like this:
==26112== 123 bytes in 1 blocks are definitely lost in loss record 1 of 2
==26112==    at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26112==    by 0x400AA4: test1() (mem.bugs.cpp:7)
==26112==    by 0x400C88: main (mem.bugs.cpp:72)
==26112== 
==26112== 72,704 bytes in 1 blocks are still reachable in loss record 2 of 2
==26112==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26112==    by 0x4E9FC25: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==26112==    by 0x40101D9: call_init.part.0 (dl-init.c:78)
==26112==    by 0x40102C2: call_init (dl-init.c:36)
==26112==    by 0x40102C2: _dl_init (dl-init.c:126)
==26112==    by 0x4001299: ??? (in /lib/x86_64-linux-gnu/ld-2.19.so)
==26112==    by 0x1: ???
==26112==    by 0xFFEFFFD46: ???
==26112==    by 0xFFEFFFD4E: ???
==26112== 
It shows exactly what we wanted, including the filename and line numbers where to look.
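
To make the report concrete, here is a minimal sketch of the kind of code that produces a "definitely lost" record from operator new[], along with two ways to fix it. This is my own reconstruction under the assumption that test1 simply allocates an array and never frees it; the function names are hypothetical and the real mem.bugs.cpp may differ:

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <memory>    // std::unique_ptr

// Hypothetical stand-in for test1: the block is never released, so
// Valgrind would report it as "definitely lost".
void leaky(std::size_t size)
{
  assert(size > 0);
  char* buffer = new char[size];
  buffer[0] = 'a';
  // Missing: delete[] buffer;
}

// Fix 1: release the block with the matching delete[].
char first_byte_manual(std::size_t size)
{
  assert(size > 0);
  char* buffer = new char[size];
  buffer[0] = 'a';
  char first = buffer[0];
  delete[] buffer;
  return first;
}

// Fix 2: let a smart pointer call delete[] automatically.
char first_byte_raii(std::size_t size)
{
  assert(size > 0);
  std::unique_ptr<char[]> buffer(new char[size]);
  buffer[0] = 'b';
  return buffer[0];   // the array is freed when buffer goes out of scope
}
```

Calling leaky(123) from main and running under Valgrind with --leak-check=full should give a report shaped like the one above, while the two fixed versions should come back clean.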

Ok, let's suppress the false-positive since it's going to show up in every test. I'm just going to give a quick overview of how to do that. For more details, read Suppressing errors and Writing suppression files.

First, you have to generate a suppression file:

valgrind --quiet --leak-check=full --gen-suppressions=yes ./gnu64 1
Now, when you run it you'll see this:
==26594== 123 bytes in 1 blocks are definitely lost in loss record 1 of 2
==26594==    at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26594==    by 0x400AA4: test1() (mem.bugs.cpp:7)
==26594==    by 0x400C88: main (mem.bugs.cpp:72)
==26594== 
==26594== 
==26594== ---- Print suppression ? --- [Return/N/n/Y/y/C/c] ---- 
Valgrind has paused and the last line is prompting you to answer the question Print suppression ? Since the memory leak being shown is an actual memory leak, you DO NOT want to suppress that. If you do, Valgrind will not report it as a leak anymore, which is not what we want. Press n to skip the suppression.

Now, Valgrind displays the false-positive and asks again if you want to suppress this:

==26894== 72,704 bytes in 1 blocks are still reachable in loss record 2 of 2
==26894==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26894==    by 0x4E9FC25: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==26894==    by 0x40101D9: call_init.part.0 (dl-init.c:78)
==26894==    by 0x40102C2: call_init (dl-init.c:36)
==26894==    by 0x40102C2: _dl_init (dl-init.c:126)
==26894==    by 0x4001299: ??? (in /lib/x86_64-linux-gnu/ld-2.19.so)
==26894==    by 0x1: ???
==26894==    by 0xFFEFFFD46: ???
==26894==    by 0xFFEFFFD4E: ???
==26894== 
==26894== 
==26894== ---- Print suppression ? --- [Return/N/n/Y/y/C/c] ---- 
You'll notice that none of the lines reference our source code (mem.bugs.cpp), which is a good sign that the leak is not in our code.

This time, type y to generate the suppression output. This is what Valgrind displays on my computer:

{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   obj:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
   fun:call_init.part.0
   fun:call_init
   fun:_dl_init
   obj:/lib/x86_64-linux-gnu/ld-2.19.so
   obj:*
   obj:*
   obj:*
}
You need to copy the text (including the curly braces) and paste it into a text file that will be used as the suppression file. You can name the text file anything you want; I usually name mine false.supp.

Note: Do NOT copy the text from this web page. Every system produces a different suppression file. If you just copy the one from this web page, it won't work. You must generate the file on your particular computer.

Now, when you run Valgrind, you will specify this suppression file on the command line so that Valgrind will ignore (suppress) the false-positive:

valgrind --quiet --leak-check=full --suppressions=false.supp ./gnu64 1
This assumes that you put the suppression file in the same directory that you are running in. Now, the output looks like this:
==27200== 123 bytes in 1 blocks are definitely lost in loss record 1 of 2
==27200==    at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27200==    by 0x400AA4: test1() (mem.bugs.cpp:7)
==27200==    by 0x400C88: main (mem.bugs.cpp:72)
==27200== 
which contains just the memory leak that we put in the code.

Pro Tip: Since this false-positive will show up in every program you run, you'll want to put this file in some other location and refer to that location. You don't want to have dozens (or hundreds) of these files around because, as you'll shortly see, when the suppression file needs to be updated, you don't want to have to update 100 of them. You could put the file in your bin directory in your home directory (for example: /home/user/bin/false.supp). Yeah, it's not a binary file (executable), but the bin directory in my home directory is a convenient location and exists on all systems I use.

Now, run Valgrind like this:
valgrind --quiet --leak-check=full --suppressions=/home/user/bin/false.supp ./gnu64 1
Now, you can use this suppression file for all of your programs, regardless of where they are on your computer.

Note: The suppression files are specific to the versions of the operating system, compiler, libraries, and Valgrind. If you update any of these, you may have to re-generate the suppression file. You can see this in the suppression file, as it references version 6.0.25 of libstdc++.so. If this library gets updated, the suppression file will no longer work.

This is the only false-positive that Valgrind is showing and it's coming from the standard C++ library libstdc++.so. Depending on what kinds of other libraries/code you include in your program, you may get more, sometimes MANY more, false-positives. You would simply run Valgrind with the --gen-suppressions=yes option and copy/paste the output into the same suppression file. That's why there are curly braces in the output: the lines of text between each pair of curly braces describe one false-positive. There is no limit to how many of these suppression blocks can be put into the file. (I've got suppression files that have over 5,000 lines in them! Thank you, Qt libraries...)

Finally, you'll notice that the first line of the suppression text was this:

{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   . . .
}
This is so you can give a name to the suppression block. Valgrind doesn't use the name when matching errors, but it lets you document what the block is suppressing. I usually change it to this:
{
   [false positive from libstdc++]
   Memcheck:Leak
   . . .
}
It doesn't matter what you call it, but it must be present. Valgrind treats the first non-blank line of each block as the name, so if you leave it out, the suppression won't work.

I've just scratched the surface of what Valgrind can do. For more information, consult the Valgrind User Manual.

You can see all of the output from Valgrind using GNU's 64-bit compiler with the test file mem.bugs.cpp here.


Tracking File Descriptors with Valgrind

Valgrind can do a lot more than just detect memory bugs. I'm going to show one more thing that Valgrind can detect. Many beginning programmers fail to close files when they are done with them. This can lead to problems.

Tracking file descriptors is a fancy way of saying that you can detect files that were not closed (e.g. via fclose). File handles (descriptors) are a finite resource just like memory, and if you fail to close them, you can run out of them (just like memory). Valgrind has an option that enables this check. Here's a starting place (fd.c):

int main()
{
  return 0;
}
This is a complete program that doesn't do anything. Or does it? Build it:
gcc fd.c -g -o fd
and run Valgrind on it with the appropriate option:
valgrind -q --track-fds=yes ./fd
and this is the output:
==31318== FILE DESCRIPTORS: 3 open at exit.
==31318== Open file descriptor 2: /dev/pts/2
==31318==    <inherited from parent>
==31318== 
==31318== Open file descriptor 1: /dev/pts/2
==31318==    <inherited from parent>
==31318== 
==31318== Open file descriptor 0: /dev/pts/2
==31318==    <inherited from parent>
Valgrind is saying that we have 3 leaks. They are not memory leaks; they are resource leaks (file descriptors, to be exact). But the code didn't open any files! Think back to your beginning C/C++ programming course and you'll remember that when your program runs, three "files" are opened for you:
  1. stdin - By default, this is the keyboard (opened for input, fd 0).
  2. stdout - By default, this is the screen (opened for output, fd 1).
  3. stderr - By default, this is the screen (opened for output, fd 2).

These three files are also referenced as file descriptors 0, 1, and 2 (sometimes referred to as file handles). The operating system keeps track of open files using integers. File descriptors 0, 1, and 2 are generally reserved for those three files. When a file is opened, the lowest unused integer is used. That's why the first file you open in your program will normally be file descriptor 3. If you open a second file without closing the first, it will be file descriptor 4, and so on.
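
If you want to see the "lowest unused integer" rule for yourself, here is a small sketch. It assumes a POSIX system (fileno is POSIX, not standard C or C++), and the helper names are my own:

```cpp
#include <cassert>
#include <stdio.h>   // fopen, fclose, fileno (fileno is POSIX, not ISO C)

// Returns the descriptor number the system hands out for a freshly
// opened file, then closes the file again. Returns -1 if the open fails.
int next_descriptor(const char* path)
{
  assert(path != nullptr);
  FILE* fp = fopen(path, "r");
  if (fp == nullptr)
    return -1;
  int fd = fileno(fp);
  fclose(fp);          // closing makes the number available again
  return fd;
}

// Opens the same file twice without closing in between and reports
// whether the second descriptor is higher than the first.
bool second_open_gets_higher_number(const char* path)
{
  assert(path != nullptr);
  FILE* first = fopen(path, "r");
  FILE* second = fopen(path, "r");
  bool higher = first && second && fileno(second) > fileno(first);
  if (second) fclose(second);
  if (first) fclose(first);
  return higher;
}
```

Because next_descriptor closes the file before returning, calling it twice in a row should hand back the same number both times. When only stdin, stdout, and stderr are open, I'd expect that number to be 3.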

You don't have to open these files, as they are already opened for you. You also don't have to close them, as the system will close them when the program ends. They show up as being still open because they aren't closed until long after main finishes. That means these are false-positives that we can ignore. (Also, notice that they were inherited from the parent.)

OK, let's open a file and "forget" to close it:

#include <stdio.h> /* FILE, fopen */

void test1()
{
  /* Open a file, but don't close it. */
  FILE *fp = fopen("foo.txt", "wt");
}

int main()
{
  test1();
  return 0;
}
Now, running it under Valgrind produces this output:
==4960== FILE DESCRIPTORS: 4 open at exit.
==4960== Open file descriptor 3: foo.txt
==4960==    at 0x4F26170: __open_nocancel (syscall-template.S:81)
==4960==    by 0x4EB0ED7: _IO_file_open (fileops.c:228)
==4960==    by 0x4EB0ED7: _IO_file_fopen@@GLIBC_2.2.5 (fileops.c:333)
==4960==    by 0x4EA53D3: __fopen_internal (iofopen.c:90)
==4960==    by 0x40052D: test1 (fd.c:5)
==4960==    by 0x400542: main (fd.c:10)
==4960== 
==4960== Open file descriptor 2: /dev/pts/2
==4960==    <inherited from parent>
==4960== 
==4960== Open file descriptor 1: /dev/pts/2
==4960==    <inherited from parent>
==4960== 
==4960== Open file descriptor 0: /dev/pts/2
==4960==    <inherited from parent>
You can see that there is a fourth file descriptor shown (file descriptor 3) which is the file that our program opened for writing. It also shows you the line number that opened the file. This makes it pretty easy to locate bugs (file leaks) in your program.
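
Fixing the leak just means closing what you opened. Here is a minimal sketch of two ways to do that in C++: an explicit fclose, and a std::unique_ptr with a custom deleter so the file gets closed automatically. The names write_and_close, write_with_raii, FileCloser, and FilePtr are my own and don't come from fd.c:

```cpp
#include <cassert>
#include <memory>    // std::unique_ptr
#include <stdio.h>   // FILE, fopen, fclose, fputs

// Fix 1: close the file explicitly.
// Returns false if the open, the write, or the close fails.
bool write_and_close(const char* path, const char* text)
{
  assert(path != nullptr && text != nullptr);
  FILE* fp = fopen(path, "w");
  if (fp == nullptr)
    return false;
  bool ok = fputs(text, fp) >= 0;
  return fclose(fp) == 0 && ok;        // always close, even if the write failed
}

// Fix 2: tie the fclose to the lifetime of a smart pointer.
struct FileCloser
{
  // unique_ptr only calls the deleter for non-null pointers.
  void operator()(FILE* fp) const { fclose(fp); }
};
using FilePtr = std::unique_ptr<FILE, FileCloser>;

bool write_with_raii(const char* path, const char* text)
{
  assert(path != nullptr && text != nullptr);
  FilePtr fp(fopen(path, "w"));
  if (!fp)
    return false;
  return fputs(text, fp.get()) >= 0;   // fclose runs when fp goes out of scope
}
```

With either version, running under --track-fds=yes should show only the three inherited descriptors at exit.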

One last thing to mention. You'll see that the first 3 file descriptors don't reference any filenames (they use pseudo-terminals). If you were to redirect stdin, stdout, and stderr (assume in.txt exists):

valgrind -q --track-fds=yes ./fd < in.txt > out.txt 2> err.txt
you would see something like this output from Valgrind:
==10412== FILE DESCRIPTORS: 3 open at exit.
==10412== Open file descriptor 2: /home/user/data/Courses/notes/code/drmem/err.txt
==10412==    <inherited from parent>
==10412== 
==10412== Open file descriptor 1: /home/user/data/Courses/notes/code/drmem/out.txt
==10412==    <inherited from parent>
==10412== 
==10412== Open file descriptor 0: /home/user/data/Courses/notes/code/drmem/in.txt
==10412==    <inherited from parent>
Valgrind shows the actual names of the files that are used with the redirection. Pretty cool!

Summary:

As was mentioned earlier, Valgrind can do much more than just check for memory misuse. It includes an entire suite of tools, and Memcheck (the memory error detector used throughout this guide) is only one of them.

Check out the Valgrind User Manual for all of the details on using the suite. However, be warned, once you start using Valgrind, you won't be able to program without it!