memlog - A Memory-Allocation Logging Tool

This tool attempts to help you answer the question:
Why is my application using so much memory?

** LINKING **

How to use it depends on how your application is linked:

For dynamically-linked applications, you can do one of the following (example
commands follow the list):

1. Use LD_PRELOAD: Set LD_PRELOAD=/path/to/memlog/libmemlog.so when you run
your application.

2. Link directly: Add the following to your linker flags:
   -L/path/to/memlog -Wl,-rpath,/path/to/memlog -lmemlog

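For example (my_program and the c++ compiler driver are just placeholders for
your own executable and toolchain), option 1 looks like:

LD_PRELOAD=/path/to/memlog/libmemlog.so ./my_program

and a direct-link command for option 2 might look like:

c++ -O3 -g -o my_program my_obj1.o my_obj2.o -L/path/to/memlog -Wl,-rpath,/path/to/memlog -lmemlog
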
For statically-linked applications, ld's automatic wrapping functionality is
employed, and the exact set of necessary flags is large, so a file named
memlog_s_ld_cmds has been provided containing the necessary flags.

To your linker flags add:

`cat /path/to/memlog/memlog_s_ld_cmds`

or, if your compiler and wrappers support response files (gcc and clang do, for
example), simply:

@/path/to/memlog/memlog_s_ld_cmds

so your overall linking command might look something like this:

mpic++ -O3 -g -o my_program my_obj1.o my_obj2.o @/path/to/memlog/memlog_s_ld_cmds
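
If your toolchain does not support response files, the same command can be
written with shell command substitution instead (assuming a POSIX-style
shell):

mpic++ -O3 -g -o my_program my_obj1.o my_obj2.o `cat /path/to/memlog/memlog_s_ld_cmds`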

** RUNNING **

When your application runs, you should find in your current directory files
named 'HOST.PID.memlog', one for each process. These contain the raw tracing
information, and are only somewhat human readable. You can create a ps/pdf
file detailing the memory allocated when each process reached its peak memory
use by running:

/path/to/memlog/memlog_analyze /path/to/HOST.PID.memlog

This will generate files named HOST.PID.memlog.dot, HOST.PID.memlog.ps and
HOST.PID.memlog.pdf. You'll probably find the pdf file most convenient for
viewing. HOST.PID.memlog.txt is also generated, providing the same information
in textual form.

If you pass the --leaks option to memlog_analyze, it will provide data on
allocations active at the end of the program (leaks) instead of those active
when the peak memory usage is first reached.
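
For example, to get leak information from the same trace file:

/path/to/memlog/memlog_analyze --leaks /path/to/HOST.PID.memlog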

You might have many runs of the same application (or output from many ranks of
an MPI job), and you'd like to pick the one with the highest memory usage for
analysis. If you provide a glob pattern to memlog_analyze, it will do this for
you. Make sure you quote the glob pattern so that your shell does not expand
it.

/path/to/memlog/memlog_analyze "/path/to/*.memlog"

When running under common batch systems, the files are named
JOB_ID.HOST.PID.memlog, and when running under the BG/Q CNK, the process's rank
is used instead of the node-local PID.

Note that the peak memory usage is determined by monitoring the process's
maximum resident set size, not just the total allocated heap memory.

memlog_analyze takes, as an optional second parameter, the name of the output
directory (the current directory is the default). If the directory does not
exist, it will be created.
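
For example (my_output_dir is just a placeholder name):

/path/to/memlog/memlog_analyze /path/to/HOST.PID.memlog my_output_dir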

memlog_analyze depends on dot (from the graphviz package) and ps2pdf (from the
ghostscript package), plus various tools from the binutils package.
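
On a Debian-style system, for example, these prerequisites can typically be
installed with:

sudo apt-get install graphviz ghostscript binutils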

** RELATED WORK **

Why was memlog created? There are several other tools that can support this use
case, but none of them worked properly in our environment. They were either too
slow, not runnable under the BG/Q CNK, not thread safe, did not properly
support big-endian PPC64, supported only static or only dynamic linking, did
not collect full backtraces, or just did not produce sufficiently-informative
peak-usage output.

That having been said, some other tools that might interest you:
Valgrind Massif - http://valgrind.org/docs/manual/ms-manual.html
Google Performance Tools - http://google-perftools.googlecode.com/svn/trunk/doc/heapprofile.html
memtrail - https://github.com/jrfonseca/memtrail
LeakTracer - http://www.andreasen.org/LeakTracer/
glibc mtrace - http://www.gnu.org/s/hello/manual/libc/Allocation-Debugging.html
Heaptrack - http://milianw.de/blog/heaptrack-a-heap-memory-profiler-for-linux
MemProf - http://www.secretlabs.de/projects/memprof/

The dot/pdf output produced by memlog was definitely inspired by that produced
by Google's pprof tool in the aforementioned package.