Over the last few weeks, I have needed to understand a lot of existing source code. People, myself included, often reverse-engineer binaries but pay little attention to reverse-engineering available source code, if that is the proper name for it. While it is not reverse engineering at its core, analyzing source code tends to tickle the same parts of the brain as reverse-engineering binaries does. This post is a short description of the tools and techniques I have been using, and I hope to receive suggestions for techniques I have been missing.
The tool whose source I have been reading is ltrace, which I believe is mostly the work of the Debian project (I could be wrong), with a port to FreeBSD, which is what I have been using. I have spent plenty of hours just reading the source code, using grep to figure out where to look next.
Read on for descriptions!
GNU gprof, the GNU profiler, is a tool typically used to measure the run-time performance (not the algorithmic complexity) of a program's components. To use gprof, compile your target program with the flags ‘-g -pg’. That compiles the program with debugging symbols and adds extra code that records profiling information. If you need to use gprof on a port or an existing application that uses the standard GNU build toolchain, just add ‘-g -pg’ to CFLAGS inside the Makefile. Also look for the lines where the program gets linked, as the link step needs ‘-g -pg’ as well. Then run the instrumented program with the arguments you want to profile, e.g. ‘./ltrace /bin/ls 2>/dev/null’. When it exits normally, it writes its profile data to a file named gmon.out in the current directory. Finally, run gprof on the binary and that file: ‘gprof ./ltrace gmon.out > ltrace_gprof.out’. The resulting file, ltrace_gprof.out, lists the procedures that were executed and the procedures called from each of them. This simplifies tracing the code for one specific execution of the binary: by passing the required arguments, you steer the execution path to your liking. While I did not use gprof to profile, I did use it to trace the execution in a more-or-less high-level fashion.
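Since gprof does not run the program itself (the instrumented binary writes gmon.out when it exits, and gprof then reads that file), the two steps are easy to wrap in a small script. What follows is only a minimal sketch in Python, assuming the binary was already built with ‘-g -pg’ and that gprof is on the PATH; the function names profile_commands and write_report are mine, not part of any tool.

```python
import subprocess


def profile_commands(binary, args, gmon="gmon.out"):
    """Return the two commands of the gprof workflow as argv lists."""
    # Step 1: run the -pg build; it writes gmon.out to the current
    # directory when it exits normally.
    run_cmd = [binary, *args]
    # Step 2: gprof reads the binary (for symbols) plus the profile data.
    report_cmd = ["gprof", binary, gmon]
    return [run_cmd, report_cmd]


def write_report(binary, args, report_path):
    """Run both steps and save gprof's report to report_path."""
    run_cmd, report_cmd = profile_commands(binary, args)
    # The traced program's own output is not interesting here, so drop it.
    subprocess.run(run_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    with open(report_path, "w") as report:
        subprocess.run(report_cmd, stdout=report, check=True)
```

Calling write_report("./ltrace", ["/bin/ls"], "ltrace_gprof.out") from the build directory should leave the flat profile and the call graph in ltrace_gprof.out.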
The second tool I want to write about is GNU cflow. cflow, which I’m sure everyone except me already knew about, takes a number of C files and generates a graph of the function call hierarchy. It is very useful for seeing, statically, how a program executes. One can just call ‘cflow -X *.c’ and look at a pstree-like representation of the call graph. In my opinion, it is very useful for understanding the flow of a program. Here is some sample output:
77 fopen {}
78 fgets {}
79 process_line {read_config_file.c 112}
80 debug_ ... {62}
81 eat_spaces {read_config_file.c 70}
82 str2type {read_config_file.c 55}
83 strlen {}
84 strncmp {}
85 index {}
86 start_of_arg_sig {read_config_file.c 81}
87 strlen {}
88 output_line ... {66}
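On a larger call graph it helps to separate the functions defined in the project from calls into libc. The following is a rough sketch in Python that parses lines shaped like the sample above. It assumes that empty braces mark a function with no definition in the analyzed files, that braces holding a file name and line number mark a function defined there, and that ‘... {N}’ refers back to an earlier output line; those readings are my interpretation of the output, and the helper names are made up.

```python
import re

# "<n> <name> {<file> <line>}", "<n> <name> {}" or "<n> <name> ... {<m>}"
LINE_RE = re.compile(r"^\s*(\d+)\s+(\w+)\s+(\.\.\.\s+)?\{([^}]*)\}\s*$")

SAMPLE = """\
77 fopen {}
78 fgets {}
79 process_line {read_config_file.c 112}
80 debug_ ... {62}
81 eat_spaces {read_config_file.c 70}
82 str2type {read_config_file.c 55}
83 strlen {}
84 strncmp {}
85 index {}
86 start_of_arg_sig {read_config_file.c 81}
87 strlen {}
88 output_line ... {66}
"""


def parse_cflow(text):
    """Turn cflow-style lines into (number, name, kind, detail) tuples."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if match is None:
            continue  # skip anything that does not look like an entry
        number, name, backref, detail = match.groups()
        if backref:
            kind = "backref"    # assumed: expanded at an earlier output line
        elif detail.strip():
            kind = "defined"    # has a source file and line number
        else:
            kind = "external"   # no definition in the analyzed files
        entries.append((int(number), name, kind, detail.strip()))
    return entries


def names_of(entries, kind):
    """Names with the given kind, in order of first appearance."""
    seen = []
    for _, name, entry_kind, _ in entries:
        if entry_kind == kind and name not in seen:
            seen.append(name)
    return seen


entries = parse_cflow(SAMPLE)
print("defined: ", names_of(entries, "defined"))
print("external:", names_of(entries, "external"))
```

The list of defined functions is a reasonable reading order for the source, while the external list shows which library calls the code leans on.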
Why am I playing with these tools? To complete a piece of code that has stagnated at about 80% completion, or so I think. I will post more information on the tool, along with the complete source, once I get it into a working state.











