Commit fa19d0e

Chris Croft-White authored and Magne Hov committed
Add malloc free leak detector scripts
1 parent bed3570 commit fa19d0e

6 files changed: +261 -0 lines changed

README.md

Lines changed: 3 additions & 0 deletions
@@ -28,6 +28,9 @@ Allows live-record to record both parent and child after `fork()` has been calle
 [**Load Debug Symbols**](load_debug_symbols/README.md)
 Loads debug symbols by parsing the relevant section addresses.
 
+[**Malloc Free Check**](malloc_free_check/README.md)
+Checks that no memory was leaked in your program by tracking calls to `malloc()` and `free()`.
+
 [**Reconstruct file**](reconstruct_file/README.md)
 Reconstructs the content of a file by analysing reads on the execution history
 of a debugged program or LiveRecorder recording.

malloc_free_check/Makefile

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+memchecker: memchecker.c
+	gcc -g -o $@ $<
+
+memchecker.undo: memchecker
+	live-record -o $@ ./$<
+
+run: memchecker.undo
+	./malloc-free.py $<
+
+all: memchecker.undo
+
+clean:
+	rm -f *.undo *.pyc memchecker

malloc_free_check/README.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# Memory leak detection example
+This example implements a simple memory leak detector with the Undo Automation API.
+
+The `memchecker.c` example application has a single unmatched `malloc()` call (not counting a few allocations made before the application has actually started, including the 1KB buffer used by `printf()`, which the detector also reports).
+
+Using the Undo Automation API, the Python scripts process the recording to find all `malloc()` and `free()` calls. The scripts ignore every `malloc()` call that has a matching `free()` call and, after parsing the entire recording, jump back in time to each of the unmatched `malloc()` calls. For each call, the scripts:
+* Output the backtrace at the time of the call.
+* Continue execution until `malloc()` returns.
+* Output the source code for the calling function (if available) and its locals.
+
+In the case of the example program, this is sufficient to clearly show the root cause of the deliberate leak. For other recordings it should generally give a good hint, and the output includes the timestamp of each `malloc()` call, so you can open the recording, jump directly to the leaking allocation, and start debugging from there.
+
+These scripts can be used as a starting point to implement other kinds of analysis related to the standard allocation functions, such as producing a profile of how much memory is being used during execution.
+
+## How to run the demo
+Simply enter the directory and run:
+
+`make run`
+
+## How to use the scripts on other recordings
+Simply run the `malloc-free.py` script, passing the recording as the parameter:
+
+`./malloc-free.py <recording.undo>`
+
+## Enhancement ideas
+* Provide some way to filter out library code.
+* Add verbosity controls.
+* Support recordings without symbols (provide address for `malloc()` & `free()` at command line).
+* Automatically trace the use of leaking memory to identify the last read or write access to the memory.
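
The matching step that this README describes can be illustrated without a debugger. The following is a minimal sketch, not the scripts' actual code: it assumes the recording has already been reduced to a list of allocator events, and the `Event` type and `find_unmatched` function are hypothetical names invented for illustration.

```python
from collections import OrderedDict
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical record of one allocator call (not an Undo API type)."""
    time: int      # position in the recording
    kind: str      # "malloc" or "free"
    addr: int      # pointer returned by malloc() or passed to free()
    size: int = 0  # bytes requested; only meaningful for malloc()


def find_unmatched(events):
    """Return {addr: (time, size)} for allocations that are never freed."""
    live = OrderedDict()
    for ev in events:
        if ev.kind == "malloc":
            if ev.addr:  # a null return allocated nothing
                live[ev.addr] = (ev.time, ev.size)
        elif ev.kind == "free":
            # free(NULL) is a no-op; pop() with a default also tolerates
            # addresses that were allocated before tracking started.
            if ev.addr:
                live.pop(ev.addr, None)
    return live


events = [
    Event(10, "malloc", 0x1000, 40),
    Event(20, "malloc", 0x2000, 40),
    Event(30, "free", 0x1000),
    Event(40, "malloc", 0x1000, 40),  # address reused after the free
    Event(50, "free", 0x1000),
]
leaks = find_unmatched(events)
for addr, (time, size) in leaks.items():
    print(f"{time}: {size} byte(s) allocated at {addr:#x}, but never freed.")
```

Keying the dictionary by the integer address keeps the lookup independent of how a pointer happens to be formatted as text, and an address that is reused after a `free()` is tracked as a new allocation.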

malloc_free_check/malloc-free.py

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+#! /usr/bin/env udb-automate
+"""
+Undo Automation command-line script for tracking calls to malloc() and free() and checking for
+leaked memory.
+
+This script only supports the x86-64 architecture.
+
+Contributors: Chris Croft-White, Magne Hov
+"""
+
+import sys
+import textwrap
+
+from undo.udb_launcher import REDIRECTION_COLLECT, UdbLauncher
+
+
+def main(argv):
+    # Get the arguments from the command line.
+    try:
+        recording = argv[1]
+    except IndexError:
+        # Wrong number of arguments.
+        print(f"{sys.argv[0]} RECORDING_FILE", file=sys.stderr)
+        raise SystemExit(1)
+
+    # Prepare for launching UDB.
+    launcher = UdbLauncher()
+    # Make UDB run with our recording.
+    launcher.recording_file = recording
+    # Make UDB load the malloc_free_check_extension.py file from the current directory.
+    launcher.add_extension("malloc_free_check_extension")
+    # Finally, launch UDB!
+    # We collect the output as, in normal conditions, we don't want to show it
+    # to the user but, in case of errors, we want to display it.
+    res = launcher.run_debugger(redirect_debugger_output=REDIRECTION_COLLECT)
+
+    if res.exit_code == 0:
+        # All good as UDB exited with exit code 0 (i.e. no errors).
+        # The result_data attribute is used to pass information from the extension to this script.
+        unmatched = res.result_data["unmatched"]
+        print(f"The recording failed to free allocated memory {unmatched} time(s).")
+    else:
+        # Something went wrong! Print a useful message.
+        print(
+            textwrap.dedent(
+                f"""\
+                Error!
+                UDB exited with code {res.exit_code}.
+
+                The output was:
+
+                {res.output}
+                """
+            ),
+            file=sys.stderr,
+        )
+        # Exit this script with the same error code as UDB.
+        raise SystemExit(res.exit_code)
+
+
+if __name__ == "__main__":
+    main(sys.argv)
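
The launcher script reads the recording path from `argv[1]`. Indexing a list past its end raises `IndexError` (not `ValueError`), so that is the exception a usage check needs to catch. A minimal standalone sketch of the check follows; the helper name `recording_from_argv` is hypothetical and not part of the script.

```python
def recording_from_argv(argv):
    """Return the recording path from argv, or None if it is missing."""
    try:
        return argv[1]
    except IndexError:
        # argv holds only the program name: no recording was given.
        return None


print(recording_from_argv(["malloc-free.py", "memchecker.undo"]))
print(recording_from_argv(["malloc-free.py"]))
```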

malloc_free_check/malloc_free_check_extension.py

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
+"""
+Undo Automation extension module for tracking calls to malloc() and free() and checking for
+leaked memory.
+
+This script only supports the x86-64 architecture.
+
+Contributors: Chris Croft-White, Magne Hov
+"""
+
+import collections
+import re
+
+import gdb
+
+from undodb.debugger_extensions import udb
+from undodb.debugger_extensions.debugger_io import redirect_to_launcher_output
+
+
+def leak_check():
+    """
+    Implements breakpoints and stops on all calls to malloc() and free(), capturing the
+    timestamp, size and returned pointer for malloc(), then confirms the address pointer is later
+    seen in a free() call.
+
+    If a subsequent free() is not seen, then at the end of execution, output the timestamp and
+    details of the memory which was never freed.
+    """
+    # Set breakpoints on malloc() and free().
+    gdb.Breakpoint("malloc")
+    gdb.Breakpoint("free")
+
+    # Declare allocations dictionary structure.
+    allocations = collections.OrderedDict()
+
+    # Do "continue" until we have gone through the whole recording, potentially
+    # hitting the breakpoints several times.
+    end_of_time = udb.get_event_log_extent().end
+    while True:
+        gdb.execute("continue")
+
+        # Rather than having the check directly in the while condition we have
+        # it here as we don't want to print the backtrace when we hit the end of
+        # the recording but only when we stop at a breakpoint.
+        if udb.time.get().bbcount >= end_of_time:
+            break
+
+        # Use the $PC output to get the symbol and identify whether execution has stopped
+        # at a malloc() or free() call.
+        mypc = format(gdb.parse_and_eval("$pc"))
+        if re.search("malloc", mypc):
+            # In malloc(), set a FinishBreakpoint to capture the pointer returned later.
+            mfbp = gdb.FinishBreakpoint()
+
+            # For now, capture the timestamp and size of memory requested.
+            time = udb.time.get()
+            size = gdb.parse_and_eval("$rdi")
+
+            gdb.execute("continue")
+
+            # Should stop at the finish breakpoint, so capture the pointer.
+            addr = mfbp.return_value
+
+            if addr:
+                # Store details in the dictionary.
+                allocations[format(addr)] = time, size
+            else:
+                print(f"-- INFO: Malloc called for {size} byte(s) but null returned.")
+
+            print(f"{time}: malloc() called: {size} byte(s) allocated at {addr}.")
+
+        elif re.search("free", mypc):
+            # In free(), get the pointer address.
+            addr = gdb.parse_and_eval("$rdi")
+
+            time = udb.time.get()
+
+            # Delete entry from the dictionary as this memory was released.
+            if addr > 0:
+                if hex(int(addr)) in allocations:
+                    del allocations[hex(int(addr))]
+                else:
+                    print("--- INFO: Free called with unknown address")
+            else:
+                print("--- INFO: Free called with null address")
+
+            #with redirect_to_launcher_output():
+            print(f"{time}: free() called for {int(addr):#x}")
+
+
+    # If allocations has any entries remaining, they were not released.
+    with redirect_to_launcher_output():
+        print()
+        print(f"{len(allocations)} unmatched memory allocation(s):")
+        print()
+
+        total = 0
+
+        # Increase the amount of source from default (10) to 16 lines for more context.
+        gdb.execute("set listsize 16")
+        for addr in allocations:
+            time, size = allocations[addr]
+            total += size
+            print("===============================================================================")
+            print(f"{time}: {size} byte(s) were allocated at {addr}, but never freed.")
+            print("===============================================================================")
+            udb.time.goto(time)
+            print("Backtrace:")
+            gdb.execute("backtrace")
+            print()
+            print("Source (if available):")
+            gdb.execute("finish")
+            gdb.execute("list")
+            print()
+            print("Locals (after malloc returns):")
+            gdb.execute("info locals")
+            print()
+            print()
+        print("===============================================================================")
+        print(f" In total, {total} byte(s) were allocated and not released")
+        print()
+
+    return len(allocations)
+
+
+# UDB will automatically load the modules passed to UdbLauncher.add_extension and, if present,
+# automatically execute any function (with no arguments) called "run".
+def run():
+    # Needed to allow GDB to fixup breakpoints properly after glibc has been loaded
+    gdb.Breakpoint("main")
+
+    unmatched = leak_check()
+
+    # Pass the number of unmatched allocations back to the outer script.
+    udb.result_data["unmatched"] = unmatched
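
The README in this commit suggests these scripts as a starting point for profiling memory use. One way to sketch that idea, assuming the `malloc()` and `free()` stops have already been reduced to plain tuples, is shown below. The `memory_profile` function and the event format are hypothetical and not part of the extension.

```python
def memory_profile(events):
    """Yield (time, bytes_in_use) after each malloc()/free() event.

    events: iterable of (time, kind, addr, size) tuples, where kind is
    "malloc" or "free" and size is ignored for "free".
    """
    sizes = {}   # addr -> size of the live allocation at that address
    in_use = 0
    for time, kind, addr, size in events:
        if kind == "malloc" and addr:
            sizes[addr] = size
            in_use += size
        elif kind == "free":
            # Unknown or null addresses contribute nothing.
            in_use -= sizes.pop(addr, 0)
        yield time, in_use


events = [
    (1, "malloc", 0x1000, 1024),
    (2, "malloc", 0x2000, 40),
    (3, "free", 0x2000, 0),
    (4, "malloc", 0x3000, 40),
]
profile = list(memory_profile(events))
peak = max(in_use for _, in_use in profile)
print(profile)
print(f"peak usage: {peak} byte(s)")
```

In the real extension the same bookkeeping could plausibly live next to the `allocations` dictionary, with sizes converted to `int` at the point where they are read from the register.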

malloc_free_check/memchecker.c

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+#include <stdio.h>
+#include <stdlib.h>
+
+int
+main(void)
+{
+    int i;
+
+    for (i = 1; i < 20; ++i)
+    {
+        int *addr = (int *)malloc(10 * sizeof(int));
+        printf("Address allocated: %p\n", addr);
+
+        if (!(i % 10 == 0))
+        {
+            printf("Address freed: %p\n", addr);
+            free(addr);
+        }
+    }
+}
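
To see why the README speaks of a single unmatched `malloc()` in this program, the loop can be replayed in Python. This is only a model of the C program's control flow; the 4-byte `int` is an assumption that holds on x86-64 Linux.

```python
SIZEOF_INT = 4  # assumption: sizeof(int) on x86-64 Linux

leaked = []
for i in range(1, 20):           # mirrors: for (i = 1; i < 20; ++i)
    size = 10 * SIZEOF_INT       # malloc(10 * sizeof(int))
    if not (i % 10 == 0):
        continue                 # free(addr) runs, nothing leaks
    leaked.append((i, size))     # free() is skipped on this iteration

print(leaked)
```

Only the iteration with `i == 10` skips `free()`, so the application itself leaks one 40-byte block under that assumption.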
